Tools for the Recognition of Sorting Signals and the Prediction of Subcellular Localization of Proteins From Their Amino Acid Sequences

Kenichiro Imai1 and Kenta Nakai2*
1 Cellular and Molecular Biotechnology Research Institute, National Institute of Advanced Industrial Science and Technology (AIST), Tokyo, Japan
2 The Institute of Medical Science, The University of Tokyo, Tokyo, Japan

REVIEW article
Front. Genet., 25 November 2020 | https://doi.org/10.3389/fgene.2020.607812
This article is part of the Research Topic "Analysis of Bioinformatics Tools in Systems Genetics".
Editor and reviewers: Shuai Cheng Li (City University of Hong Kong, Hong Kong, SAR China); Litao Sun (Sun Yat-sen University, China); Marti Aldea (Institute of Molecular Biology of Barcelona, Spanish National Research Council (CSIC), Spain). The editor and reviewers' affiliations are the latest provided on their Loop research profiles and may not reflect their situation at the time of review.
At the time of translation, nascent proteins are thought to be sorted into their final subcellular localization sites, based on the part of their amino acid sequences (i.e., sorting or targeting signals). Thus, it is interesting to computationally recognize these signals from the amino acid sequences of any given proteins and to predict their final subcellular localization with such information, supplemented with additional information (e.g., k-mer frequency). This field has a long history and many prediction tools have been released. Even in this era of proteomic atlases at the single-cell level, researchers continue to develop new algorithms, aiming at assessing the impact of disease-causing mutations/cell type-specific alternative splicing, for example. In this article, we overview the entire field and discuss its future direction.

Introduction

Although we should not underestimate the importance of non-coding genes, the main players of the genetic system of living organisms are still regarded as protein-coding genes, which specify amino acid sequence information. Thus, in principle, we should be able to infer the in vivo fate of any protein from its amino acid sequence, if its environmental conditions, such as the cell type where it is synthesized, are appropriately given. For example, we should be able to predict the three-dimensional structure of a protein from its sequence or to design novel amino acid sequences that take a desired three-dimensional structure (Baker, 2019), as well as to predict how it binds/interacts with other proteins/small molecule ligands (Vakser, 2020). Another important piece of information to be predicted is which kind of post-translational modifications, if any, it will undergo [and at which residue(s); Audagnotto and Dal Peraro, 2017]. Also, it may be possible to predict the half-life of a given protein/peptide based on the degradation signals (degrons) and/or other properties (Mathur et al., 2018; Eldeeb et al., 2019). Finally, the prediction of subcellular localization of a protein based on its amino acid sequence is a challenging field in bioinformatics. It is well accepted that protein sorting for subcellular localization is regulated by so-called protein sorting (or targeting) signals, which are typically represented as short stretch(es) of the amino acid sequence. Nowadays, many of the protein localization mechanisms/pathways that recognize and utilize such signals have been clarified. Therefore, many predictors have been developed for the recognition of such sorting signals, and attempts have been made to combine such predictors, leading to the comprehensive prediction of the final localization site. However, not all such signals have been clarified. Moreover, not all proteins are equipped with such typical signals; some use alternative (minor/exceptional) pathways. Adding the knowledge of such exceptional cases will make the prediction system gradually more realistic, but the objective assessment of its performance, like the ones commonly used in the field of machine learning, will become difficult because the knowledge of exceptional cases is quite unlikely to be generalized (in other words, any sequence features of such exceptional proteins, which have nothing to do with their sorting mechanisms, would work as clues for their prediction). It should also be noted that the practical value of subcellular localization predictors has been degraded because the localization information is being comprehensively determined with subcellular proteomics experiments (Harvey Millar and Taylor, 2014). However, the rise of synthetic biology as well as precision medicine will demand prediction tools that enable the prediction against artificial proteins and/or the prediction of the impact of mutations/polymorphic variations on potential sorting signals.

In this review article, we will introduce the outline of this field, emphasizing its recent progress. The readers are recommended to refer to additional reviews by other authors and ourselves, too (Imai and Nakai, 2010, 2019; Du and Xu, 2013; Nielsen, 2017; Nielsen et al., 2019).
Prediction of Subcellular Localization Sites for Bacterial/Archaeal Proteins

Even in the simplest type of organisms, which are unicellular organisms without any subcellular compartments, proteins can be localized at the cytoplasmic space, the cellular membrane, or the extracellular space (i.e., secreted). This is basically the case for so-called Gram-positive bacteria and archaea, but, in reality, they also have a cell wall as another localization site. The basic prediction strategy for these proteins is to combine two kinds of predictors: a predictor for N-terminal signal peptides and one for transmembrane segments. Namely, a protein that has neither an N-terminal (and cleavable) signal peptide nor any hydrophobic transmembrane segment(s) is predicted to be localized at the cytoplasmic space; a protein that has any transmembrane segment(s) (including an N-terminal uncleavable segment) is predicted to be localized at the cellular membrane; and finally, a protein that has a cleavable N-terminal signal peptide but does not have any transmembrane segment(s) is predicted to be secreted to the extracellular space or to be localized at the cell wall. In Gram-positive bacteria, proteins that are anchored to the cell wall are characterized by the LPXTG motif, followed by a hydrophobic domain and a tail of positively-charged residues (for a recent review, see Siegel et al., 2017). On the other hand, Gram-negative bacteria contain one more membrane, the outer membrane, instead of the cell wall. Therefore, their possible localization sites are the cytoplasmic space, the inner membrane (which is equivalent to the membrane of Gram-positive bacteria), the periplasm, the outer membrane, and the extracellular space. Generally speaking, proteins that are localized at the latter three sites (the periplasm, the outer membrane, and the extracellular space) have an N-terminal cleavable signal peptide but do not have any hydrophobic transmembrane segment(s). Proteins that are integrated into the outer membrane are typically β-barrel proteins (Bakelar et al., 2017). To distinguish these three types of proteins, their differences in amino acid composition and/or k-mer frequency as well as motif/homology-based methods are often used.
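The basic decision scheme described above can be written down as a small rule table. The following is a minimal sketch, not the logic of PSORT or PSORTb: the function name, its arguments, and the returned labels are hypothetical, and the two inputs are assumed to come from a separate signal-peptide predictor and a separate transmembrane-segment predictor.

```python
def predict_bacterial_site(has_cleavable_signal_peptide: bool,
                           n_transmembrane_segments: int,
                           gram_negative: bool = False) -> str:
    """Combine two upstream predictions into a coarse localization call.

    Hypothetical helper: the inputs are assumed to be the outputs of a
    signal-peptide predictor and a transmembrane-segment predictor.
    """
    # Any transmembrane segment (including an uncleaved N-terminal one)
    # places the protein in the (inner) membrane.
    if n_transmembrane_segments > 0:
        return "inner membrane" if gram_negative else "cellular membrane"
    # No signal peptide and no transmembrane segment: cytoplasmic.
    if not has_cleavable_signal_peptide:
        return "cytoplasm"
    # Cleavable signal peptide without transmembrane segments: exported.
    # Further features (composition, k-mers, motifs) would be needed to
    # separate the remaining candidate sites.
    if gram_negative:
        return "periplasm / outer membrane / extracellular"
    return "extracellular / cell wall"


if __name__ == "__main__":
    print(predict_bacterial_site(False, 0))
    print(predict_bacterial_site(True, 0, gram_negative=True))
```

The last two branches deliberately return a set of candidate sites, since the text notes that separating them requires additional features such as amino acid composition, k-mer frequency, or motif/homology searches.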
A pioneering work proposing the above formalism was published in 1991 (Nakai and Kanehisa, 1991), where the predictor was named PSORT (I). In 2003, its approach was inherited and elaborated by Fiona Brinkman's group (Gardy et al., 2003); their software is named PSORTb (or PSORT-B). Its latest version is PSORTb 3.0 (Yu et al., 2010). The group published an excellent review of bacterial protein subcellular localization in 2006 (Gardy and Brinkman, 2006). According to the assessment shown in the review, PSORTb was the best predictor at that time. The group also releases PSORTdb, which contains a collection of experimentally determined information on subcellular localization as well as systematic outputs of PSORTb applied to thousands of bacterial proteomes [its latest reference reports v.3.0 (Peabody et al., 2016), but its latest version is v.4.0]. The same group also proposes PSORTm, a variant of PSORTb designed for metagenomic data (Peabody et al., 2020). The basic idea of PSORTm is to first identify the taxonomy of each read based on a reference database of microbial proteins. From the estimated taxonomy, the read is automatically classified by cell envelope type and is then subjected to a variant of PSORTb, which uses various types of analyses (such as motif/profile analysis) for its subcellular localization prediction. Although a precise assessment of its accuracy would be difficult, the authors report an assessment using artificial data and a comparison with the prediction against pre-assembled data. In view of the rapid growth of microbiome analyses, the need for characterizing metagenome data should increase even more, and thus the field looks promising. Of course, other groups have developed a variety of predictors for bacterial/archaeal proteins, among which PSO-LocBact (Lertampaiporn et al., 2019), GPos-ECC-mPLoc/Gneg-ECC-mPLoc (Wang et al., 2015), BUSCA (Savojardo et al., 2018b), which will be introduced below, and ClubSub-P (Paramasivam and Linke, 2011) were released relatively recently. Some of them claim that they can deal with proteins with multiple locations. Although a database for (eukaryotic) proteins with multiple subcellular localizations was once released (Zhang et al., 2008), it still seems difficult to classify multiple localizations objectively and quantitatively because the data come from different sources that rely on different experimental conditions (but see the discussion below).

Beyond the basic scheme described above, there are several issues to be further explored. One is the prediction of several specialized localization sites, such as host-associated, type III secretion, fimbrial, flagellar, and spore. In PSORTb, they are treated as subcategories. Of course, it is favorable that a predictor can deal with such localization sites, but it is questionable whether such a predictor can also deal with artificial proteins that are transported to such locations. In other words, it is likely that such predictions are easily done with simple homology transfer from known examples. Another issue is how to deal with the proteins that are transported via minor pathways. For the users' convenience, it is desirable that a predictor can inform users which pathway the input protein will use. For example, it is surely useful if a predictor informs us that the input protein will be transported via the twin-arginine translocation pathway (Palmer and Stansfeld, 2020) or the lipoprotein signal peptidase II-dependent pathway (El Arnaout and Soulimane, 2019). This can already be done with several predictors, including SignalP-5.0 (Almagro Armenteros et al., 2019, see below). Hopefully, more knowledge of various protein sorting pathways will be incorporated into predictors, even if the objective assessment of their predictability becomes difficult. In this sense, more benchmarking efforts/systematic analyses of subcellular localization from various viewpoints would be valuable (Stekhoven et al., 2014; Orioli and Vihinen, 2019; see below).
Prediction of Subcellular Localization Sites for Eukaryotic Proteins

So far, many prediction methods for eukaryotic protein subcellular localization have been developed. They are mainly based on biological/empirical sequence features related to subcellular localization. In these methods, a variety of machine learning algorithms, such as the k-nearest neighbor (k-NN) classifier, the Random Forest classifier, the support vector machine (SVM), and deep learning, have been used. Those methods usually target 10 main localization sites, where subcompartments of localization sites are merged into 10 major sites in order to increase the number of proteins per localization site (see Table 1). As further explained below, for the prediction of subcellular localization sites, three types of prediction features are generally used: targeting signal features, sequence-based features, and annotation-based features (Figure 1). The features associated with targeting signals are most powerful, when available, and many subcellular localization predictors based on targeting signal features have been developed. Thus, we first overview the representative targeting-signal predictors and then predictors for localization sites.

TABLE 1. Representative subcellular locations covered by predictors for eukaryotic proteins.

FIGURE 1. Summary of representative prediction approaches of different subcellular localization.

Prediction of Targeting Signals

The targeting signals are roughly grouped into two categories: N-terminal targeting signals and non-N-terminal targeting signals. The mitochondrial targeting signal (presequences), the signal sequence for the secretory pathway (signal peptides), and the transit signal for chloroplast (transit peptides) are well known as N-terminal targeting signals, while the nuclear localization signal (NLS) and the nuclear export signal (NES) are internal signal sequences. Peroxisome matrix proteins contain peroxisomal targeting signal type 1 (PTS1) in the C-terminus.
Prediction of Mitochondrial Targeting Signal

Mitochondria have been estimated to host 1,000 to 1,500 distinct proteins. Approximately 99% of mitochondrial proteins are encoded in the nuclear genome and are imported by translocases in the mitochondrial outer and inner membranes. Approximately 60% of mitochondrial proteins possess an N-terminal cleavable targeting signal (presequence; Vögtle et al., 2009). These presequences are typically recognized by the translocase of the outer membrane (TOM) receptors, which consist of Tom20 and Tom22, in the TOM complex. Then, they direct the translocation of signal-containing proteins through the main protein translocation channel, Tom40 (Pfanner et al., 2019). Upon translocation across the outer membrane, the presequence-containing proteins are transferred across the inner membrane by the translocase of the inner membrane complex (TIM23) with the presequence translocase-associated motor (PAM). The length of presequences is 20–60 amino acid residues (Calvo et al., 2017). The representative features of presequences are a high content of arginine residues and a low content of negatively-charged residues (von Heijne, 1986; Schneider et al., 1998). Positively charged amphiphilicity (an amphiphilic α-helical structure with hydrophobic residues on one face and positively-charged residues on the opposite face) is also a well-characterized feature (Chacinska et al., 2009; Fukasawa et al., 2015). Recently, the TOM complex structure was revealed by cryo-electron microscopy, and it provided structural insights into the import path of presequence-containing precursor proteins through the TOM complex (Araiso et al., 2019). The presequence is typically cleaved by three mitochondrial peptidases in the matrix (MPP, Icp55, and Oct1; Mossmann et al., 2012). MPP cleaves two residues C-terminal to an arginine, so that the arginine lies at position -2 relative to the cleavage site (the R-2 motif). Icp55 and Oct1 subsequently cleave off one amino acid and eight amino acids from the newly emerged N-terminus, respectively. Therefore, proteins processed by MPP and Icp55 have an arginine at position -3 (the R-3 motif) in the presequence, while proteins processed by MPP and Oct1 have an arginine at position -10 (the R-10 motif).
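The position arithmetic behind the R-2, R-3, and R-10 motifs can be checked with a few lines of code. This is only an illustrative sketch under stated assumptions: the function name is hypothetical, position -k is taken to be the k-th residue N-terminal to the final cleavage site, and the order in which the three positions are tested is an arbitrary choice made here, not something prescribed by the cited studies.

```python
def classify_cleavage_motif(precursor: str, presequence_length: int) -> str:
    """Label a known cleavage site by the position of an arginine.

    `presequence_length` is the number of residues removed from the
    N-terminus, so position -k corresponds to the 0-based index
    presequence_length - k in `precursor`.
    """
    # Offsets follow the text: MPP alone (R-2), MPP + Icp55 (R-3, one
    # extra residue removed), MPP + Oct1 (R-10, eight extra residues).
    for offset, label in ((2, "R-2"), (3, "R-3"), (10, "R-10")):
        index = presequence_length - offset
        if index >= 0 and precursor[index] == "R":
            return label
    return "no R motif"


if __name__ == "__main__":
    # Toy precursor: arginine at index 10, so MPP would cut after index 11.
    toy = "M" + "A" * 9 + "RS" + "T" * 12
    print(classify_cleavage_motif(toy, 12))      # MPP only
    print(classify_cleavage_motif(toy, 12 + 1))  # followed by Icp55
    print(classify_cleavage_motif(toy, 12 + 8))  # followed by Oct1
```

Running the toy example shows how the same arginine moves from position -2 to -3 or -10 once one or eight further residues are removed from the new N-terminus.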
MitoProt II (Claros, 1995), TargetP (Emanuelsson et al., 2000), Predotar (Small et al., 2004), TPpred 3.0 (Savojardo et al., 2015), and MitoFates (Fukasawa et al., 2015) have been widely used presequence prediction methods. They were developed using machine-learning techniques with the above features of presequences. Those tools are capable of predicting the existence of a presequence as well as its cleavage site. MitoProt II and MitoFates are specific predictors for (mitochondrial) presequences, while TargetP, Predotar, and TPpred 3.0 can also predict other N-terminal targeting signals, such as the secretory signal sequence and the chloroplastic targeting signal. Recently, TargetP 2.0 was developed as a deep learning model, using bidirectional long short-term memory (LSTM) and a multi-attention mechanism (Armenteros et al., 2019). Among existing tools, three of them (MitoFates, TPpred 3.0, and TargetP 2.0) perform better in the prediction of both the presequence existence and its cleavage site. MitoFates employs an SVM classifier combining amino acid composition and physicochemical properties with positively charged amphiphilicity, discovered presequence motifs, and position-weight matrices of cleavage site patterns. TPpred 3.0 is a combination of a Grammatical Restrained Hidden Conditional Random Field, N-to-1 Extreme Learning Machines, and SVMs. We compared the performance of those three methods, using recent proteomic data of the N-termini of mouse mitochondrial proteins (we omitted proteins whose cleaved N-terminal sequences are shorter than 10 or longer than 100 amino acids in the comparison; Calvo et al., 2017). The recalls of presequence prediction by TPpred 3.0, MitoFates, and TargetP 2.0 are 63.2, 75.9, and 79.9%, respectively, whereas the recalls of cleavage site prediction by TPpred 3.0, MitoFates, and TargetP 2.0 are 27.0, 28.8, and 45.5%, respectively. MitoFates and TargetP 2.0 show better performance on the presequence prediction. In the cleavage site prediction, TargetP 2.0 far outperformed the other methods, though cleavage site prediction is still a challenging task. About 20% of the mouse cleavage site data do not match the R-2, R-3, or R-10 motifs (Calvo et al., 2017). It will be necessary to better characterize these atypical presequences.
Prediction of Signal Sequence

The targeting signal sequence for the secretory pathway (signal peptide) is located at the N-terminus of the protein sequence in both eukaryotes and prokaryotes. The length of signal peptides is 16–30 amino acid residues. It is estimated that about 10–20% of the eukaryotic proteome and 10% of the bacterial proteome have a signal peptide at the N-terminus (Kanapin et al., 2003; Ivankov et al., 2013). In eukaryotic cells, the signal recognition particle (SRP) co-translationally recognizes signal peptides upon their emergence from the ribosome and transfers them to the Sec61 translocon in the endoplasmic reticulum (ER) membrane via the SRP receptor (Nilsson et al., 2015). The signal peptidase cleaves off signal peptides and thus mature proteins are generated. Signal peptides share several characteristic features (von Heijne, 1990); they have a tripartite architecture: a positively charged N-terminus (n-region), a hydrophobic segment (h-region), and a cleavage site for signal peptidase (c-region). The cleavage site is characterized by the (-1, -3) rule: amino acids with small, uncharged side chains occupy the -1 and -3 positions relative to the cleavage site.

For predicting signal peptides and their cleavage sites, many prediction methods, such as SignalP 4.0 (Petersen et al., 2011), SPEPlip (Fariselli et al., 2003), Phobius (Krogh et al., 2007), and DeepSig (Savojardo et al., 2018a), have been developed. The discrimination between secretory and non-secretory proteins based on signal peptide prediction has been the most successful among targeting signal predictions: SignalP 3.0 had already achieved the best Matthews' Correlation Coefficient (MCC) of 0.76 on eukaryotic datasets in an assessment study in 2009 (Choo et al., 2009). Recently, SignalP has been further improved as a deep neural network-based method, combined with conditional random field classification and optimized transfer learning (SignalP-5.0; Almagro Armenteros et al., 2019). According to their benchmark results, SignalP-5.0 outperforms other methods in predicting both the signal peptide existence and the cleavage site: the MCC was 0.88 in the signal peptide prediction and the recall of cleavage site detection was 72.9%.
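The (-1, -3) rule and the 16–30 residue length range quoted above can be turned into a simple filter for candidate cleavage sites. The sketch below is an illustration only and is not how SignalP or the other cited tools work. The function names are hypothetical, and the set of "small, uncharged" residues (A, G, S, C, T) is an assumption made here, since the text does not enumerate them.

```python
# Assumption: residues treated as "small, uncharged" for this sketch.
SMALL_UNCHARGED = set("AGSCT")


def satisfies_minus1_minus3_rule(protein: str, signal_peptide_length: int) -> bool:
    """Check positions -1 and -3 relative to a proposed cleavage site.

    The cleavage site lies between index signal_peptide_length - 1
    (position -1) and index signal_peptide_length (position +1).
    """
    if signal_peptide_length < 3 or signal_peptide_length > len(protein):
        return False
    minus1 = protein[signal_peptide_length - 1]
    minus3 = protein[signal_peptide_length - 3]
    return minus1 in SMALL_UNCHARGED and minus3 in SMALL_UNCHARGED


def candidate_cleavage_sites(protein: str, min_len: int = 16, max_len: int = 30) -> list:
    """List signal-peptide lengths in the given range that satisfy the rule."""
    upper = min(max_len, len(protein))
    return [n for n in range(min_len, upper + 1)
            if satisfies_minus1_minus3_rule(protein, n)]


if __name__ == "__main__":
    # Toy sequence: charged start, hydrophobic core, "ASA" before the cut.
    toy = "MKK" + "L" * 12 + "ASA" + "QPEDKLMNQQ"
    print(candidate_cleavage_sites(toy))
```

A rule of this kind only narrows down candidate positions; on real proteins it would typically accept several sites, which is one reason why the cited predictors model the n-, h-, and c-regions jointly.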
Prediction of Chloroplastic Targeting Signal

The translocons at the outer and the inner membranes of chloroplasts, the TOC and TIC complexes, mediate the targeting and import of ~3,500 different nuclear-encoded proteins. Those proteins are imported from the cytoplasm via interaction between their cleavable, N-terminal chloroplast targeting signals (transit peptides) and the TOC–TIC import systems (Li and Chiu, 2010; Paila et al., 2015). The transit peptide is removed by the activity of stromal processing peptidase (SPP), which is related to the mitochondrial peptidase, MPP. SPP does not interact stably with the TOC–TIC import system; thus, the cleavage event occurs after protein translocation or upon the emergence of the transit peptide cleavage site into the stroma. Chloroplast transit peptides are mostly unstructured but can form α-helical structures in hydrophobic environments (Bruce, 2001; Jarvis, 2008). In addition, chloroplast transit peptides have a high content of hydroxylated amino acids (e.g., serine residues) and positively charged amino acids and a very low content of negatively charged amino acids (Bhushan et al., 2006). Transit peptides and presequences are therefore similar in several aspects. In spite of the similarities, chloroplast transit peptides direct precursor proteins specifically to chloroplasts. Ge et al. (2014) demonstrated that transit peptides and presequences can be discriminated by their charge properties and hydrophobicity. Also, the analysis of 916 chloroplast proteins revealed an N-terminal domain beginning with Met-Ala and a low content of arginine in the N-terminal portion (Zybailov et al., 2008). Moreover, Lee et al. (2019) recently showed that mitochondrial or chloroplast targeting specificities are characterized by the N-terminal regions of these targeting signals: an N-terminal multiple-arginine motif was identified as the mitochondrial specificity factor and chloroplast evasion signal. Cleavage sites of transit peptides are characterized by a higher content of Ala, Ile, Cys, and Val residues (Gavel and von Heijne, 1990). Three motifs, [V,I][R,A]↓[A,C]AAE, S[V,I][R,S,V]↓[C,A]A, and [A,V]N↓A[A,M]AG[E,D], were derived from a set of 198 cleavage sites (Savojardo et al., 2015).
The existing prediction tools for the chloroplastic targeting signal deal with cleavable N-terminal transit peptides. Widely used prediction methods have been integrated as a part of the prediction of N-terminal targeting signals in general: e.g., TargetP (Emanuelsson et al., 2000), iPSORT (Bannai et al., 2002), Predotar (Small et al., 2004), and TPpred3 (Savojardo et al., 2015). Among those tools, TPpred3 achieved better performance for transit peptide prediction (46% precision and 64% recall). As mentioned above, TargetP was recently updated to version 2.0 as a deep learning model (TargetP 2.0; Armenteros et al., 2019). In their comparison, the precision and recall of chloroplastic transit peptide identification by TargetP 2.0 are 90 and 86%, respectively, while those of TPpred3 are 76 and 69%. In the cleavage site prediction, the recalls of TargetP 2.0 and TPpred3 are 49 and 30%, respectively. Like mitochondrial presequence prediction, the cleavage site prediction of the chloroplastic targeting signal is a difficult problem. Compared with the amount of data available for signal peptides, that for transit peptides is quite small, which could explain the lower performance. Larger-scale N-terminal proteomics data of chloroplastic proteins would be necessary for the improvement of their cleavage site prediction.
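The precision and recall figures quoted above, and the Matthews' Correlation Coefficient (MCC) used elsewhere in this review, are all derived from the same binary confusion matrix. The following minimal sketch uses the standard textbook definitions; the function names and the example counts are made up for illustration and do not correspond to any benchmark cited here.

```python
import math


def precision(tp: int, fp: int) -> float:
    """Fraction of predicted positives that are true positives."""
    return tp / (tp + fp) if (tp + fp) else 0.0


def recall(tp: int, fn: int) -> float:
    """Fraction of actual positives that are recovered."""
    return tp / (tp + fn) if (tp + fn) else 0.0


def mcc(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews' Correlation Coefficient for a binary confusion matrix."""
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        return 0.0  # conventional value when a row or column is empty
    return (tp * tn - fp * fn) / denominator


if __name__ == "__main__":
    # Made-up counts: 10 true signal-containing proteins among 100.
    tp, fp, fn, tn = 8, 2, 2, 88
    print(precision(tp, fp), recall(tp, fn), mcc(tp, tn, fp, fn))
```

Unlike accuracy, the MCC uses all four cells of the confusion matrix, which makes it more informative when signal-containing proteins are a small minority of the test set.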
Prediction of Nuclear Localization Signals and Nuclear Export Signals

Nuclear proteins are transported into or out of the nuclei through the nuclear pore complex by the importin-β (Impβ) family of nucleocytoplasmic transport receptors (Kimura and Imamoto, 2014). The human proteome contains 20 Impβ family proteins: 10 are nuclear import receptors (importin-β, transportin-1, -2, -SR, importin-4, -5 (RanBP5), -7, -8, -9 and -11), seven are export receptors (exportin-1 (CRM1), -2 (CAS/CSE1L), -5, -6, -7, -t, and RanBP17), two are bi-directional receptors (importin-13 and exportin-4), while the function of the remaining one, RanBP6, is undetermined (Kimura and Imamoto, 2014). Those nucleocytoplasmic transport receptors are thought to recognize specific targeting signals on their cargo proteins. Several types of NLSs and NESs have been reported so far. The most studied NLS is the classical NLS (cNLS) that binds to Impα, which is a cargo-binding adaptor exclusively used for Impβ (Lange et al., 2007). Sequences similar to the Impβ binding (IBB)-domain in Impα act as NLSs that bind directly to Impβ. Other known NLSs/NESs that bind directly to Impβ family members are: the PY-NLS for Trn1 and Trn2 (Lee et al., 2006), the Leu-rich NES for CRM1 (Hutten and Kehlenbach, 2007), the SR-domain for TrnSR (Maertens et al., 2014), and the β-like importin binding (BIB)-domain, which binds to several nucleocytoplasmic transport receptors (Jäkel and Görlich, 1998). In addition, the RG/RGG-rich segment for Trn1 and the RSY-rich segment for TrnSR were reported recently (Bourgeois et al., 2020). However, these known NLSs/NESs do not explain all of the cargo recognition sites. Moreover, a recent proteomic analysis for the identification of cargo proteins of 12 nucleocytoplasmic transport receptors (10 nuclear import receptors and 2 bi-directional receptors; Kimura et al., 2017) also pointed out that about 30% of identified cargos are shared by multiple receptors. The degree of multiplicity and diversity of cargo recognition by nucleocytoplasmic transport receptors is still controversial.
Among known nuclear targeting signals, the cNLS and the NES of CRM1 are well characterized. Thus, existing prediction methods for NLSs and NESs mainly target these two types. cNLSs are grouped into monopartite and bipartite NLSs. A monopartite NLS is characterized by a single stretch of basic residues (e.g., KR[K/R]R and K[K/R]RK), while a bipartite NLS has two clusters of basic residues, separated by a spacer region of 10–12 amino acids (e.g., KR(X)10–12K[K/R][K/R]; Kosugi et al., 2009). Lisitsyna et al. (2017) assessed the prediction performance of widely used methods, NucPred (Brameier et al., 2007), cNLS Mapper (Kosugi et al., 2008a), NLStradamus (Ba et al., 2009), NucImport (Mehdi et al., 2011), and seqNLS (Lin and Hu, 2013), using a human NLS dataset. NucPred, seqNLS, and NLStradamus showed better MCCs (~0.3); however, the recalls of those methods were still ~45%. Recently, Guo et al. (2020) reported INSP, which is an NLS predictor based on a multivariate regression model integrating a PSSM-based conservation score, a protein language-based SVM learning score, a disorder-based structural score, and an amino acid physicochemical property-based score. On their test dataset, INSP showed 50.6% precision at 67.0% recall, whereas seqNLS, NLStradamus, and cNLS Mapper obtained 60.6% precision at 36.4% recall, 53.9% precision at 35.6% recall, and 50.9% precision at 50.9% recall, respectively. INSP showed a favorable balance between precision and recall, but NLS prediction still seems to be difficult because the cNLS sequence patterns are often observed in non-nuclear protein sequences (i.e., false positives).
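The example cNLS consensus patterns quoted above translate directly into regular expressions. The sketch below is a naive pattern scan, not the scoring scheme of cNLS Mapper or any other cited tool; the function and variable names are hypothetical, and the two test sequences are short illustrative peptides rather than full-length proteins.

```python
import re

# Lookahead wrappers let overlapping candidates be reported.
# Patterns are taken from the examples in the text:
#   monopartite: KR[K/R]R or K[K/R]RK
#   bipartite:   KR, a 10-12 residue spacer, then K[K/R][K/R]
MONOPARTITE = re.compile(r"(?=(KR[KR]R|K[KR]RK))")
BIPARTITE = re.compile(r"(?=(KR.{10,12}K[KR][KR]))")


def find_cnls_candidates(sequence: str) -> list:
    """Return (kind, 0-based start, matched text) for each pattern hit."""
    hits = []
    for kind, pattern in (("monopartite", MONOPARTITE), ("bipartite", BIPARTITE)):
        for match in pattern.finditer(sequence):
            hits.append((kind, match.start(), match.group(1)))
    return hits


if __name__ == "__main__":
    print(find_cnls_candidates("PKKKRKV"))            # basic single cluster
    print(find_cnls_candidates("KRPAATKKAGQAKKKK"))   # two basic clusters
```

Because such short basic patterns also occur by chance in many non-nuclear proteins, a scan like this would be expected to produce the false positives discussed above; it is useful mainly for locating candidate sites to be scored by other means.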
Nuclear export signals function as essential regulators for the export of hundreds of distinct cargo proteins by interacting with CRM1. So far, 11 consensus patterns of NES have been proposed by a peptide-library study and structure analyses of CRM1-NES (Kosugi et al., 2008b; Fung et al., 2015, 2017). In general, NESs are represented by Φ0-(x)1–2-Φ1-(x)2–3-Φ2-(x)2–3-Φ3-x-Φ4 (Φ1–Φ4 denote Leu, Val, Ile, Phe, or Met, while x is any amino acid; Φ0 is not restricted to hydrophobic amino acids). The residues at Φ0–Φ4 are bound to the corresponding hydrophobic pockets in CRM1. Based on the pattern of these Φ's and spacing sequences, the NES motifs are classified into seven classes and four additional reverse classes, representing binding in the opposite direction. Several prediction tools for NESs, such as NetNES (La Cour et al., 2004), NESsential (Fu et al., 2011), NESmapper (Kosugi et al., 2014), Wregex (Prieto et al., 2014), LocNES (Xu et al., 2015), and NoLogo (Liku et al., 2018), have been developed, representing the consensus sequences with regular expressions or PSSMs as well as biophysical properties (disorder propensity, solvent accessibility, and secondary structure information). Among those tools, LocNES outperformed other prediction tools; however, the precision is ~50% at 20% recall. The low performance is caused by high false-positive rates. As mentioned above, the NES consensus patterns are simple and commonly observed in other protein sequences. Thus, it seems to be difficult to improve the prediction performance by only using the sequence information. Recently, Lee et al. (2019) provided a comprehensive table for cargo proteins, containing the location of the NES motifs with the disorder propensity, the predicted secondary structures, and the conserved domain information. They also proposed a structure modeling-based prediction, which predicts the binding energy of the NES peptide bound to the binding groove of CRM1, using multiple structures of CRM1-NES peptide complexes as templates (Lee et al., 2019). The structure-based method performed at the same level as LocNES in recall but outperformed LocNES in specificity and false-positive rate. Thus, combining sequence-based and structure-based predictions seems promising for significantly improving NES prediction. Moreover, NLSdb, which is a database containing NLSs and NESs, has been recently updated (Bernhofer et al., 2018). In this update, a set of potential novel NLSs and NESs has been generated by an in silico mutagenesis protocol. These potential NLSs and NESs match at least one nuclear protein but do not match any non-nuclear proteins. The updated NLSdb contains 2,253 NLSs (1,614 are potential NLSs) and 398 NESs (192 are potential NESs). The data would be useful to further improve the NLS and NES prediction performances.

Prediction of Subcellular Localization Site of Protein in a Cell

Existing methods for predicting subcellular localization sites can be grouped into four categories. The first category of prediction methods uses only sequence-based features. Some sequence-based features are used in localization site prediction because their differences are empirically known to be correlated with the differences between localization sites. Such empirical features include the frequency of dipeptides, n-grams, and k-mers as well as the pseudo amino acid composition of the entire amino acid sequence (or that of the predicted mature sequence). Pseudo amino acid composition is more informative in that it incorporates sequence-order information of a protein sequence (Chou, 2001). These empirical sequence-based features have also been popular in various other amino acid sequence-based predictions. Besides these systematically defined features, sequence features of various known targeting signals are more or less useful, as mentioned above. Functional motifs are also used in the prediction because sequence motifs associated with the function of a protein are closely related to its localization site (for example, a protein containing a DNA-binding motif is likely to be localized in the nucleus). The first sequence-based method was PSORT (I) (Nakai and Kanehisa, 1992), which was developed about 30 years ago, and later many other methods, such as WoLF PSORT (Horton et al., 2007), CELLO2.5 (Yu et al., 2006), and DeepLoc (Almagro Armenteros et al., 2017), have been developed. WoLF PSORT, an update of PSORT II (Horton and Nakai, 1997), converts the input amino acid sequences into a numerical vector consisting of amino acid composition and PSORT/iPSORT (Nakai and Kanehisa, 1992; Bannai et al., 2002) localization features, and then classifies proteins into subcellular locations with a weighted k-NN classifier. CELLO2.5 is a two-level SVM classifier system: the first level comprises a number of SVM classifiers, each based on distinctive sets of feature vectors generated from amino acid sequence data, and the second-level SVM classifier functions as the jury machine to generate the probability distribution of decisions for possible localizations. Recently, several deep learning-based predictors have been developed, of which DeepLoc is representative. DeepLoc uses recurrent neural networks (RNNs) with long short-term memory (LSTM) cells that process the entire amino acid sequence and an attention mechanism identifying sequence regions important for the subcellular localization.

The second category of predictors uses annotation-based features obtained from experimental evidence. GO terms, localization annotation in UniProt, functional domains, protein-protein interactions, and literature information from PubMed abstracts are categorized into this type of feature. mGOASVM (Wan et al., 2012) is a predictor for the subcellular localization of multi-location proteins based on GO terms. In mGOASVM, multi-label GO vectors, which are the occurrences of GO terms of homologous proteins, are constructed, and then the GO vectors are recognized by SVM classifiers equipped with a decision strategy that can produce multiple class labels for a query protein. pLoc-mEuk (Cheng et al., 2018) was recently developed by extracting the key GO information into "Chou's general Pseudo Amino Acid Composition." pLoc-mEuk can also deal with proteins with multiple locations. Generally speaking, however, compared with those features, the transfer of localization annotation from a homologous protein seems to be simpler and more useful. We previously pointed out that a simple homology-based inference outperforms methods based on machine learning if a homologous protein with localization annotation is available (Imai and Nakai, 2010).
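As a concrete illustration of the simplest sequence-based features mentioned for the first category (amino acid composition and dipeptide/k-mer frequencies), the following sketch converts a sequence into a fixed-length numerical vector. It is a generic illustration under stated assumptions, not the feature encoding of WoLF PSORT, CELLO2.5, or any other cited tool; the function names are hypothetical, and k-mers containing non-standard residue letters are simply ignored.

```python
from collections import Counter
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def kmer_frequencies(sequence: str, k: int = 1) -> list:
    """Relative frequency of every possible k-mer, in a fixed order."""
    all_kmers = ["".join(p) for p in product(AMINO_ACIDS, repeat=k)]
    counts = Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))
    # Only k-mers made of standard residues contribute to the total.
    total = sum(counts[kmer] for kmer in all_kmers)
    return [counts[kmer] / total if total else 0.0 for kmer in all_kmers]


def feature_vector(sequence: str) -> list:
    """Concatenate composition (20 values) and dipeptide frequency (400)."""
    return kmer_frequencies(sequence, 1) + kmer_frequencies(sequence, 2)


if __name__ == "__main__":
    vector = feature_vector("MKKLLA")
    print(len(vector))
```

A vector of this kind could then be passed to any of the classifiers named in the text (k-NN, SVM, Random Forest); pseudo amino acid composition extends the same idea with additional sequence-order terms.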
The third category comprises predictors combining sequence-based and annotation-based features, such as MultiLoc2 (Blum et al., 2009), SherLoc2 (Briesemeister et al., 2009), YLoc (Briesemeister et al., 2010), and LocTree3 (Goldberg et al., 2014). MultiLoc2 utilizes an SVM predictor, MultiLoc (Höglund et al., 2006), which is based on overall amino acid composition and the presence of known sorting signals, combined with phylogenetic profiles and GO terms. SherLoc2 combines MultiLoc2 and EpiLoc (Brady and Shatkay, 2008), a prediction system based on features derived from PubMed abstracts. YLoc is based on a simple naive Bayes classifier, which combines various features ranging from simple amino acid composition to annotation information, like PROSITE domains and GO terms from close homologs. LocTree3 improves over a machine learning-based predictor, LocTree2 (Goldberg et al., 2012), by combining the machine learning-based method with homology-based inference through PSI-BLAST.

The fourth category is the ensemble of several prediction methods (meta-servers): the prediction scores of several predictors are collected and then used to train a machine learning model, such as a Random Forest classifier or an SVM. SubCons (Salvatore et al., 2017) is a recent ensemble method, which combines four predictors (CELLO2.5, LocTree2, MultiLoc2, and SherLoc2) using a Random Forest classifier. BUSCA also integrates different prediction methods. The prediction pipeline of BUSCA consists of predictors for targeting signals [DeepSig (Savojardo et al., 2018a) and TPpred3 (Savojardo et al., 2015)], for GPI-anchors [PredGPI (Pierleoni et al., 2008)], for transmembrane domains [ENSEMBLE3.0 (Martelli et al., 2003) and BetAware (Savojardo et al., 2013)], and discriminators of subcellular localization for both globular and membrane proteins [BaCelLo (Pierleoni et al., 2006), MemLoci (Pierleoni et al., 2011), and SChloro (Savojardo et al., 2017)].
Recent Benchmarks for Subcellular Localization Prediction

Evaluation of the performance of subcellular localization prediction is often difficult for the following reasons: (i) There are often overlaps between the training data of one method and the test data of other methods. In those cases, the performances could be overestimated. (ii) Comparison of sequence-based methods with annotation-based methods, or with methods combining sequence- and annotation-based features, tends to be unfair. For example, the measured accuracy of annotation-based methods would become apparently higher if the majority of test data used for sequence-based methods are included in the databases used for the prediction by annotation-based methods.

To evaluate the prediction performance with less bias, Salvatore et al. recently made a benchmark dataset which consists of proteins containing identical subcellular annotations in at least two out of three resources (Salvatore et al., 2017): two large-scale study datasets on subcellular localization of human proteins (Uhlen et al., 2010; Fagerberg et al., 2011; Breckels et al., 2013; Christoforou et al., 2014) and proteins with "manually curated" annotation of subcellular localization in UniProt (UniProt Consortium, 2019). Then, they examined the performance of six state-of-the-art methods [CELLO2.5 (Yu et al., 2006), LocTree2 (Goldberg et al., 2012), MultiLoc2 (Blum et al., 2009), SherLoc2 (Briesemeister et al., 2009), WoLF PSORT (Horton et al., 2007), and YLoc (Briesemeister et al., 2010)] as well as SubCons (Salvatore et al., 2017) for eight localization sites (nucleus, mitochondria, ER, Golgi apparatus, lysosome, peroxisome, plasma membrane, and cytoplasm). They used the Generalized Squared Correlation (GC2; Baldi et al., 2000) for performance evaluation. GC2 is a subtype of the Gorodkin measure (Gorodkin, 2004), which can be seen as a generalization of MCC that applies to K categories. The Gorodkin measure is more informative than the accuracy measure when there is an imbalance of classes. For K = 2, the Gorodkin measure squared is GC2. In this assessment, SubCons showed the best overall prediction performance, GC2 = 0.32, and the second best was SherLoc2 (GC2 = 0.27). On the other hand, during the development of DeepLoc (Almagro Armenteros et al., 2017), the authors made an independent test set by performing a stringent homology partitioning against experimentally annotated protein data in UniProt. Homologous proteins that fulfill a certain threshold of similarity were clustered, and then each cluster of homologous proteins was assigned to one of five folds, ensuring that similar proteins were not mixed between the different folds. Four folds were used for training and validation, while the remaining one was used for testing. Using the test set, they compared the prediction performance of DeepLoc with the above six methods (CELLO2.5, LocTree2, MultiLoc2, SherLoc2, WoLF PSORT, and YLoc) and iLoc-Euk (Chou et al., 2011) in 10 localization sites (extracellular and plastid are added to the above eight localization sites). DeepLoc showed the best Gorodkin measure of 0.735, and the second and third best were achieved by iLoc-Euk at 0.682 and YLoc at 0.533, respectively.

Although efforts to evaluate the prediction performance with less bias have been made, more efforts seem to be necessary. According to recent benchmarking reports based on human datasets and membrane proteins (Orioli and Vihinen, 2019; Shen et al., 2020), sequence-based methods tend to show lower performance than annotation-based methods, including meta methods. However, a certain number of proteins (or their highly homologous ones) in the benchmark test data seem to be included in the databases used in annotation-based methods. In addition, methods trained and tested with newly constructed data tend to show better performance because older data tend to include more mislabeled or questionable examples. Indeed, Almagro Armenteros et al. (2017) pointed out a considerable decrease of experimentally confirmed proteins in UniProt after a major change in the annotation standards in release 2014_09. The prediction performances of machine learning algorithms significantly depend on the datasets used. Some of the previously developed methods may outperform newer methods when they are trained and tested with the latest datasets. For fair assessments, performance comparison should therefore be done in each category with standardized benchmark datasets, ensuring independence between training and test datasets. Unfortunately, to the best of our knowledge, such standardized benchmark datasets have not been constructed so far. The datasets u
sedinpreviousstudiesareoftenusedinthedevelopmentofnovelmethods.Thestandardizationofpredictionperformancecomparisonisabigchallengebutthisisessentialandimportantinthisfield.Recentprogressinproteome-widesubcellularproteinmapping(seebelow)wouldprovidesubstantialinformationonthesubcellularlocalizationofunverifiedorunseenproteinsaswellastheinformationforcorrectingmislabeledproteins,whichshouldbehelpfulinconstructingstandardizedbenchmarkdatasets,obviously. ProteinLocalizationResourcesObtainedFromRecentSpatialProteomicsApproaches Proteomicsdataforcapturingthespatialdistributionofproteinsatthesubcellularlevel(subcellularproteinmapping)areusefulresourcesfortheirpredictivestudies.Recentadvancesinhigh-throughputmicroscopy,quantitativemassspectrometry(MS),interactomemapping,andmachinelearningapplicationsfordataanalysishaveenabledproteome-widesubcellularproteinmapping(LundbergandBorner,2019;Borner,2020).Threeexperimentalapproachesaregenerallyusedforspatialproteomics:proteome-wideimagingofproteinlocalization,protein–proteininteractionnetworkanalysis,andMS-basedorganelleprofiling.Alloftheseapproacheshaveproducednumerousavailabledataofhumanproteinsubcellularlocalization.TheHumanCellAtlasprovidesaninvaluableresourceofimagingdataatasingle-celllevel(localizationof12,003proteins;Thuletal.,2017).Theglobalorganellarmapbasedonbiotinidentification(BioID)dataisnowavailableasaresourceofprotein–proteininteractionnetworkanalysis(4,145proteins;Goetal.,2019).Severalorganelleprofilingresourcesareobtainedfromfibroblasts(2,533proteins;JeanBeltranetal.,2016)andcelllines:HeLa(8,710proteins;Itzhaketal.,2016),fivedifferentcancercelllines(12,418proteins;Orreetal.,2019),andU-2OS(2,412proteins;Geladakietal.,2019).Inaddition,organelleprofilingresourcesofmouseprimaryneurons(Itzhaketal.,2017),mouseliver(Krahmeretal.,2018),mousepluripotentstemcell(Christoforouetal.,2016),ratliver(Jadotetal.,2017),andSaccharomycescerevisiae(Nightingaleetal.,2019)arealsoavailable. 
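As a rough illustration of how resources such as these might feed into the benchmarking problem discussed in the previous section, the following Python sketch first keeps only proteins whose label agrees in at least two resources (the agreement criterion described for the benchmark of Salvatore et al., 2017) and then scores a set of predictions with the K-category correlation coefficient of Gorodkin (2004). It is a minimal sketch under simplifying assumptions, not the procedure of any cited study: each resource is assumed to assign a single label per protein, and the function names, protein identifiers, and example labels are all hypothetical.

```python
import math
from collections import Counter


def consensus_labels(resources, min_agree=2):
    """Return {protein: label} for proteins whose label is identical
    in at least `min_agree` of the given resources.

    Assumption: every resource maps a protein to exactly one label.
    Ties between labels are not handled specially.
    """
    votes = {}
    for annotations in resources:
        for protein, label in annotations.items():
            votes.setdefault(protein, Counter())[label] += 1
    benchmark = {}
    for protein, counter in votes.items():
        label, count = counter.most_common(1)[0]
        if count >= min_agree:
            benchmark[protein] = label
    return benchmark


def gorodkin_rk(true_labels, pred_labels):
    """K-category correlation coefficient (Gorodkin, 2004).

    For two classes this equals the Matthews correlation coefficient.
    Returns 0.0 when the denominator vanishes (e.g., only one class).
    """
    if len(true_labels) != len(pred_labels):
        raise ValueError("label lists must have the same length")
    s = len(true_labels)                                   # total samples
    c = sum(t == p for t, p in zip(true_labels, pred_labels))  # correct
    t_counts = Counter(true_labels)                        # true class sizes
    p_counts = Counter(pred_labels)                        # predicted class sizes
    classes = set(t_counts) | set(p_counts)
    sum_tp = sum(t_counts[k] * p_counts[k] for k in classes)
    sum_tt = sum(v * v for v in t_counts.values())
    sum_pp = sum(v * v for v in p_counts.values())
    denom = math.sqrt((s * s - sum_pp) * (s * s - sum_tt))
    return (c * s - sum_tp) / denom if denom else 0.0


# Hypothetical single-label annotations from three resources.
imaging = {"P1": "nucleus", "P2": "cytoplasm", "P3": "ER"}
ms_profiling = {"P1": "nucleus", "P2": "mitochondria", "P4": "Golgi apparatus"}
curated = {"P1": "nucleus", "P2": "mitochondria", "P3": "plasma membrane"}

# P3 and P4 are dropped: no label is supported by two resources.
benchmark = consensus_labels([imaging, ms_profiling, curated])

# Hypothetical predictions for the benchmark proteins.
predicted = {"P1": "nucleus", "P2": "cytoplasm"}
proteins = sorted(benchmark)
score = gorodkin_rk([benchmark[p] for p in proteins],
                    [predicted[p] for p in proteins])
print(benchmark, score)
```

Because the coefficient takes the class distribution into account, it is harder to inflate by always predicting a dominant class than plain accuracy is. Real resources often report several localizations per protein, so a practical version would need a multi-label agreement rule.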
Each of these approaches has its own merits for identifying protein localization: the imaging approach reveals multiple localizations and has single-cell resolution, while the MS-based approach can provide peptide-level resolution and reveal the differential localization of splicing isoforms, proteolytically processed forms, and isoforms arising from differential post-translational modifications. A recent imaging-based large-scale study reports that about half of all proteins are localized to multiple compartments, suggesting that there is a shared pool of proteins even among functionally unrelated organelles (Thul et al., 2017). Predicting proteins that reside at two or more subcellular locations is an important issue for understanding biological processes in a cell. A recent review summarizes the prediction methods that can deal with proteins with multiple locations (Chou, 2019).

A number of differentially localized isoform pairs have been found by MS-based approaches (Christoforou et al., 2016; Geladaki et al., 2019). Such localization change at the isoform level is an interesting issue in terms of targeting signal usage. Protein isoforms seem to be generated in response to stress or in a tissue-specific manner. Thus, a number of localization changes at the isoform level may still be unidentified. For mitochondrial proteins, we previously applied MitoFates to search for candidate differentially localized isoforms and obtained 517 genes, which were 44% of the predicted mitochondrial genes (Fukasawa et al., 2015), suggesting that the major localization changes of mitochondrial protein isoforms are regulated by changes in their N-terminal targeting signals. Recently, the relative protein levels of more than 12,000 genes across 32 normal human tissues were quantified, and tissue-specific or tissue-enriched proteins were identified (Jiang et al., 2020). The same study also identified a total of 2,436 tissue-enriched protein isoforms. Those isoforms may be useful for investigating tissue-specific localization changes at the isoform level.

Proteins with multiple localizations and localization changes among isoforms imply potential “moonlighting” activity. Comprehensive analyses of these proteins should further our understanding of cell biology.
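The isoform-level screen described above can be pictured with a small Python sketch. It assumes that some targeting-signal predictor has already produced a yes/no mitochondrial call for every isoform; the input tuples, gene names, and function name are hypothetical, and no particular tool's output format is implied.

```python
from collections import defaultdict


def screen_isoform_localization(isoform_calls):
    """Group per-isoform calls by gene and report candidate genes.

    isoform_calls: iterable of (gene, isoform_id, is_mitochondrial).
    Returns (candidates, mito_genes, fraction) where
      mito_genes = genes with at least one isoform called mitochondrial,
      candidates = mito_genes that also have a non-mitochondrial isoform,
      fraction   = len(candidates) / len(mito_genes), or 0.0 if empty.
    """
    calls_by_gene = defaultdict(set)
    for gene, _isoform_id, is_mito in isoform_calls:
        calls_by_gene[gene].add(bool(is_mito))
    mito_genes = sorted(g for g, calls in calls_by_gene.items() if True in calls)
    candidates = [g for g in mito_genes if False in calls_by_gene[g]]
    fraction = len(candidates) / len(mito_genes) if mito_genes else 0.0
    return candidates, mito_genes, fraction


# Hypothetical predictor output for three genes.
calls = [
    ("GENE_A", "A-1", True),
    ("GENE_A", "A-2", False),   # isoform lacking the predicted signal
    ("GENE_B", "B-1", True),
    ("GENE_B", "B-2", True),
    ("GENE_C", "C-1", False),
]
candidates, mito_genes, fraction = screen_isoform_localization(calls)
print(candidates, mito_genes, fraction)
```

A gene flagged in this way is only a candidate: a differing prediction between isoforms suggests, but does not establish, a localization change.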
Conclusion

A number of computational tools for the analysis of protein subcellular localization are introduced in this review. Although the localization sites of many proteins can nowadays be predicted through mere homology transfer, we would like to emphasize that the subcellular localization prediction problem is not a pedantic one at all. We believe that the in silico accumulation of knowledge on protein sorting/targeting processes is important. Prediction methods can be used to assess quantitatively how well we understand these processes. Future methods should be useful for various purposes, such as evaluating artificial proteins, understanding why some proteins are localized at multiple sites, and inferring how tissue-specific and/or condition-specific isoforms can change their localization sites. Therefore, in our opinion, the knowledge-based approach will be most important in the future of this field, and such knowledge should be integrated into the wider knowledge on the in vivo fate of proteins, since all of these processes are interrelated (Nakai, 2001).

Author Contributions

Both authors listed have made a substantial, direct and intellectual contribution to the work, and approved it for publication.

Funding

KI acknowledges support from JSPS KAKENHI (grant number 18K11543).

Conflict of Interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
References

Almagro Armenteros, J. J., Sønderby, C. K., Sønderby, S. K., Nielsen, H., and Winther, O. (2017). DeepLoc: prediction of protein subcellular localization using deep learning. Bioinformatics 33, 3387–3395. doi: 10.1093/bioinformatics/btx431
Almagro Armenteros, J. J., Tsirigos, K. D., Sønderby, C. K., Petersen, T. N., Winther, O., Brunak, S., et al. (2019). SignalP 5.0 improves signal peptide predictions using deep neural networks. Nat. Biotechnol. 37, 420–423. doi: 10.1038/s41587-019-0036-z
Araiso, Y., Tsutsumi, A., Qiu, J., Imai, K., Shiota, T., Song, J., et al. (2019). Structure of the mitochondrial import gate reveals distinct preprotein paths. Nature 575, 395–401. doi: 10.1038/s41586-019-1680-7
Armenteros, J. J. A., Salvatore, M., Emanuelsson, O., Winther, O., von Heijne, G., Elofsson, A., et al. (2019). Detecting sequence signals in targeting peptides using deep learning. Life Sci. Alliance 2:e201900429. doi: 10.26508/lsa.201900429
Audagnotto, M., and Dal Peraro, M. (2017). Protein post-translational modifications: in silico prediction tools and molecular modeling. Comput. Struct. Biotechnol. J. 15, 307–319. doi: 10.1016/j.csbj.2017.03.004
Ba, A. N. N., Pogoutse, A., Provart, N., and Moses, A. M. (2009). NLStradamus: a simple Hidden Markov Model for nuclear localization signal prediction. BMC Bioinformatics 10:202. doi: 10.1186/1471-2105-10-202
Bakelar, J., Buchanan, S. K., and Noinaj, N. (2017). Structural snapshots of the β-barrel assembly machinery. FEBS J. 284, 1778–1786. doi: 10.1111/febs.13960
Baker, D. (2019). What has de novo protein design taught us about protein folding and biophysics? Protein Sci. 28, 678–683. doi: 10.1002/pro.3588
Baldi, P., Brunak, S., Chauvin, Y., Andersen, C. A. F., and Nielsen, H. (2000). Assessing the accuracy of prediction algorithms for classification: an overview. Bioinformatics 16, 412–424. doi: 10.1093/bioinformatics/16.5.412
Bannai, H., Tamada, Y., Maruyama, O., Nakai, K., and Miyano, S. (2002). Extensive feature detection of N-terminal protein sorting signals. Bioinformatics 18, 298–305. doi: 10.1093/bioinformatics/18.2.298
Bernhofer, M., Goldberg, T., Wolf, S., Ahmed, M., Zaugg, J., Boden, M., et al. (2018). NLSdb-major update for database of nuclear localization signals and nuclear export signals. Nucleic Acids Res. 46, D503–D508. doi: 10.1093/nar/gkx1021
Bhushan, S., Kuhn, C., Berglund, A. K., Roth, C., and Glaser, E. (2006). The role of the N-terminal domain of chloroplast targeting peptides in organellar protein import and miss-sorting. FEBS Lett. 580, 3966–3972. doi: 10.1016/j.febslet.2006.06.018
Blum, T., Briesemeister, S., and Kohlbacher, O. (2009). MultiLoc2: integrating phylogeny and gene ontology terms improves subcellular protein localization prediction. BMC Bioinformatics 10:274. doi: 10.1186/1471-2105-10-274
Borner, G. H. H. (2020). Organellar maps through proteomic profiling-a conceptual guide. Mol. Cell. Proteomics 19, 1076–1087. doi: 10.1074/mcp.R120.001971
Bourgeois, B., Hutten, S., Gottschalk, B., Hofweber, M., Richter, G., and Sternat, J. (2020). Nonclassical nuclear localization signals mediate nuclear import of CIRBP. Proc. Natl. Acad. Sci. U.S.A. 117, 8503–8514. doi: 10.1073/pnas.1918944117
Brady, S., and Shatkay, H. (2008). EPILOC: a (working) text-based system for predicting protein subcellular location. Pac. Symp. Biocomput. 13, 604–615. doi: 10.1142/9789812776136_0058
Brameier, M., Krings, A., and MacCallum, R. M. (2007). NucPred—predicting nuclear localization of proteins. Bioinformatics 23, 1159–1160. doi: 10.1093/bioinformatics/btm066
Breckels, L. M., Gatto, L., Christoforou, A., Groen, A. J., Lilley, K. S., and Trotter, M. W. B. (2013). The effect of organelle discovery upon sub-cellular protein localisation. J. Proteomics 88, 129–140. doi: 10.1016/j.jprot.2013.02.019
Briesemeister, S., Blum, T., Brady, S., Lam, Y., Kohlbacher, O., and Shatkay, H. (2009). SherLoc2: a high-accuracy hybrid method for predicting subcellular localization of proteins. J. Proteome Res. 8, 5363–5366. doi: 10.1021/pr900665y
Briesemeister, S., Rahnenführer, J., and Kohlbacher, O. (2010). Going from where to why-interpretable prediction of protein subcellular localization. Bioinformatics 26, 1232–1238. doi: 10.1093/bioinformatics/btq115
Bruce, B. D. (2001). The paradox of plastid transit peptides: conservation of function despite divergence in primary structure. Biochim. Biophys. Acta 1541, 2–21. doi: 10.1016/s0167-4889(01)00149-5
Calvo, S. E., Julien, O., Clauser, K. R., Shen, H., Kamer, K. J., Wells, J. A., et al. (2017). Comparative analysis of mitochondrial N-termini from mouse, human, and yeast. Mol. Cell. Proteomics 16, 512–523. doi: 10.1074/mcp.M116.063818
Chacinska, A., Koehler, C. M., Milenkovic, D., Lithgow, T., and Pfanner, N. (2009). Importing mitochondrial proteins: machineries and mechanisms. Cell 138, 628–644. doi: 10.1016/j.cell.2009.08.005
Cheng, X., Xiao, X., and Chou, K. C. (2018). pLoc-mEuk: predict subcellular localization of multi-label eukaryotic proteins by extracting the key GO information into general PseAAC. Genomics 110, 50–58. doi: 10.1016/j.ygeno.2017.08.005
Choo, K. H., Tan, T. W., and Ranganathan, S. (2009). A comprehensive assessment of N-terminal signal peptides prediction methods. BMC Bioinformatics 10:S2. doi: 10.1186/1471-2105-10-S15-S2
Chou, K. C. (2001). Prediction of protein cellular attributes using pseudo-amino acid composition. Proteins Struct. Funct. Genet. 43, 246–255. doi: 10.1002/prot.1035
Chou, K. C. (2019). Advances in predicting subcellular localization of multi-label proteins and its implication for developing multi-target drugs. Curr. Med. Chem. 26, 4918–4943. doi: 10.2174/0929867326666190507082559
Chou, K. C., Wu, Z. C., and Xiao, X. (2011). iLoc-Euk: a multi-label classifier for predicting the subcellular localization of singleplex and multiplex eukaryotic proteins. PLoS One 6:e18258. doi: 10.1371/journal.pone.0018258
Christoforou, A., Arias, A. M., and Lilley, K. S. (2014). Determining protein subcellular localization in mammalian cell culture with biochemical fractionation and iTRAQ 8-plex quantification. Methods Mol. Biol. 1156, 157–174. doi: 10.1007/978-1-4939-0685-7_10
Christoforou, A., Mulvey, C. M., Breckels, L. M., Geladaki, A., Hurrell, T., Hayward, P. C., et al. (2016). A draft map of the mouse pluripotent stem cell spatial proteome. Nat. Commun. 7:8992. doi: 10.1038/ncomms9992
Claros, M. G. (1995). MitoProt, a Macintosh application for studying mitochondrial proteins. Bioinformatics 11, 441–447. doi: 10.1093/bioinformatics/11.4.441
Du, P., and Xu, C. (2013). Predicting multisite protein subcellular locations: progress and challenges. Expert Rev. Proteomics 10, 227–237. doi: 10.1586/epr.13.16
El Arnaout, T., and Soulimane, T. (2019). Targeting lipoprotein biogenesis: considerations towards antimicrobials. Trends Biochem. Sci. 44, 701–715. doi: 10.1016/j.tibs.2019.03.007
Eldeeb, M. A., Siva-Piragasam, R., Ragheb, M. A., Esmaili, M., Salla, M., and Fahlman, R. P. (2019). A molecular toolbox for studying protein degradation in mammalian cells. J. Neurochem. 151, 520–533. doi: 10.1111/jnc.14838
Emanuelsson, O., Nielsen, H., Brunak, S., and von Heijne, G. (2000). Predicting subcellular localization of proteins based on their N-terminal amino acid sequence. J. Mol. Biol. 300, 1005–1016. doi: 10.1006/jmbi.2000.3903
Fagerberg, L., Stadler, C., Skogs, M., Hjelmare, M., Jonasson, K., Wiking, M., et al. (2011). Mapping the subcellular protein distribution in three human cell lines. J. Proteome Res. 10, 3766–3777. doi: 10.1021/pr200379a
Fariselli, P., Finocchiaro, G., and Casadio, R. (2003). SPEPlip: the detection of signal peptide and lipoprotein cleavage sites. Bioinformatics 19, 2498–2499. doi: 10.1093/bioinformatics/btg360
Fu, S. C., Imai, K., and Horton, P. (2011). Prediction of leucine-rich nuclear export signal containing proteins with NESsential. Nucleic Acids Res. 39:e111. doi: 10.1093/nar/gkr493
Fukasawa, Y., Tsuji, J., Fu, S. C., Tomii, K., Horton, P., and Imai, K. (2015). MitoFates: improved prediction of mitochondrial targeting sequences and their cleavage sites. Mol. Cell. Proteomics 14, 1113–1126. doi: 10.1074/mcp.M114.043083
Fung, H. Y. J., Fu, S. C., Brautigam, C. A., and Chook, Y. M. (2015). Structural determinants of nuclear export signal orientation in binding to exportin CRM1. eLife 4:e10034. doi: 10.7554/eLife.10034
Fung, H. Y. J., Fu, S. C., and Chook, Y. M. (2017). Nuclear export receptor CRM1 recognizes diverse conformations in nuclear export signals. eLife 6:e23961. doi: 10.7554/eLife.23961
Gardy, J. L., and Brinkman, F. S. L. (2006). Methods for predicting bacterial protein subcellular localization. Nat. Rev. Microbiol. 4, 741–751. doi: 10.1038/nrmicro1494
Gardy, J. L., Spencer, C., Wang, K., Ester, M., Tusnády, G. E., Simon, I., et al. (2003). PSORT-B: improving protein subcellular localization prediction for gram-negative bacteria. Nucleic Acids Res. 31, 3613–3617. doi: 10.1093/nar/gkg602
Gavel, Y., and von Heijne, G. (1990). A conserved cleavage-site motif in chloroplast transit peptides. FEBS Lett. 261, 455–458. doi: 10.1016/0014-5793(90)80614-O
Ge, C., Spånning, E., Glaser, E., and Wieslander, Å. (2014). Import determinants of organelle-specific and dual targeting peptides of mitochondria and chloroplasts in Arabidopsis thaliana. Mol. Plant 7, 121–136. doi: 10.1093/mp/sst148
Geladaki, A., Kočevar Britovšek, N., Breckels, L. M., Smith, T. S., Vennard, O. L., Mulvey, C. M., et al. (2019). Combining LOPIT with differential ultracentrifugation for high-resolution spatial proteomics. Nat. Commun. 10:331. doi: 10.1038/s41467-018-08191-w
Go, C., Knight, J., Rajasekharan, A., Rathod, B., Hesketh, G., Abe, K., et al. (2019). A proximity biotinylation map of a human cell. bioRxiv [Preprint]. doi: 10.1101/796391
Goldberg, T., Hamp, T., and Rost, B. (2012). LocTree2 predicts localization for all domains of life. Bioinformatics 28, i458–i465. doi: 10.1093/bioinformatics/bts390
Goldberg, T., Hecht, M., Hamp, T., Karl, T., Yachdav, G., Ahmed, N., et al. (2014). LocTree3 prediction of localization. Nucleic Acids Res. 42, W350–W355. doi: 10.1093/nar/gku396
Gorodkin, J. (2004). Comparing two K-category assignments by a K-category correlation coefficient. Comput. Biol. Chem. 28, 367–374. doi: 10.1016/j.compbiolchem.2004.09.006
Guo, Y., Yang, Y., Huang, Y., and Shen, H. B. (2020). Discovering nuclear targeting signal sequence through protein language learning and multivariate analysis. Anal. Biochem. 591:113565. doi: 10.1016/j.ab.2019.113565
Harvey Millar, A., and Taylor, N. L. (2014). Subcellular proteomics-where cell biology meets protein chemistry. Front. Plant Sci. 5:55. doi: 10.3389/fpls.2014.00055
Höglund, A., Dönnes, P., Blum, T., Adolph, H. W., and Kohlbacher, O. (2006). MultiLoc: prediction of protein subcellular localization using N-terminal targeting sequences, sequence motifs and amino acid composition. Bioinformatics 22, 1158–1165. doi: 10.1093/bioinformatics/btl002
Horton, P., and Nakai, K. (1997). Better prediction of protein cellular localization sites with the k nearest neighbors classifier. Proc. Int. Conf. Intell. Syst. Mol. Biol. 5, 147–152.
Horton, P., Park, K. J., Obayashi, T., Fujita, N., Harada, H., Adams-Collier, C. J., et al. (2007). WoLF PSORT: protein localization predictor. Nucleic Acids Res. 35, W585–W587. doi: 10.1093/nar/gkm259
Hutten, S., and Kehlenbach, R. H. (2007). CRM1-mediated nuclear export: to the pore and beyond. Trends Cell Biol. 17, 193–201. doi: 10.1016/j.tcb.2007.02.003
Imai, K., and Nakai, K. (2010). Prediction of subcellular locations of proteins: where to proceed? Proteomics 10, 3970–3983. doi: 10.1002/pmic.201000274
Imai, K., and Nakai, K. (2019). “Prediction of protein localization” in Encyclopedia of Bioinformatics and Computational Biology, Vol. 2. Elsevier, 53–59.
Itzhak, D. N., Davies, C., Tyanova, S., Mishra, A., Williamson, J., Antrobus, R., et al. (2017). A mass spectrometry-based approach for mapping protein subcellular localization reveals the spatial proteome of mouse primary neurons. Cell Rep. 20, 2706–2718. doi: 10.1016/j.celrep.2017.08.063
Itzhak, D. N., Tyanova, S., Cox, J., and Borner, G. H. H. (2016). Global, quantitative and dynamic mapping of protein subcellular localization. eLife 5:e16950. doi: 10.7554/eLife.16950
Ivankov, D. N., Payne, S. H., Galperin, M. Y., Bonissone, S., Pevzner, P. A., and Frishman, D. (2013). How many signal peptides are there in bacteria? Environ. Microbiol. 15, 983–990. doi: 10.1111/1462-2920.12105
Jadot, M., Boonen, M., Thirion, J., Wang, N., Xing, J., Zhao, C., et al. (2017). Accounting for protein subcellular localization: a compartmental map of the rat liver proteome. Mol. Cell. Proteomics 16, 194–212. doi: 10.1074/mcp.M116.064527
Jäkel, S., and Görlich, D. (1998). Importin β, transportin, RanBP5 and RanBP7 mediate nuclear import of ribosomal proteins in mammalian cells. EMBO J. 17, 4491–4502. doi: 10.1093/emboj/17.15.4491
Jarvis, P. (2008). Targeting of nucleus-encoded proteins to chloroplasts in plants. New Phytol. 179, 257–285. doi: 10.1111/j.1469-8137.2008.02452.x
Jean Beltran, P. M., Mathias, R. A., and Cristea, I. M. (2016). A portrait of the human organelle proteome in space and time during cytomegalovirus infection. Cell Syst. 3, 361–373.e6. doi: 10.1016/j.cels.2016.08.012
Jiang, L., Wang, M., Lin, S., Jian, R., Li, X., Chan, J., et al. (2020). A quantitative proteome map of the human body. Cell 183, 269–283.e19. doi: 10.1016/j.cell.2020.08.036
Kanapin, A., Batalov, S., Davis, M. J., Gough, J., Grimmond, S., Kawaji, H., et al. (2003). Mouse proteome analysis. Genome Res. 13, 1335–1344. doi: 10.1101/gr.978703
Kimura, M., and Imamoto, N. (2014). Biological significance of the importin-β family-dependent nucleocytoplasmic transport. Traffic 15, 727–748. doi: 10.1111/tra.12174
Kimura, M., Morinaka, Y., Imai, K., and Kose, S. (2017). Extensive cargo identification reveals distinct biological roles of the 12 importin pathways. eLife 6:e21184. doi: 10.7554/eLife.21184
Kosugi, S., Hasebe, M., Entani, T., Takayama, S., Tomita, M., and Yanagawa, H. (2008a). Design of peptide inhibitors for the importin α/β nuclear import pathway by activity-based profiling. Chem. Biol. 15, 940–949. doi: 10.1016/j.chembiol.2008.07.019
Kosugi, S., Hasebe, M., Matsumura, N., Takashima, H., Miyamoto-Sato, E., Tomita, M., et al. (2009). Six classes of nuclear localization signals specific to different binding grooves of importin α. J. Biol. Chem. 284, 478–485. doi: 10.1074/jbc.M807017200
Kosugi, S., Hasebe, M., Tomita, M., and Yanagawa, H. (2008b). Nuclear export signal consensus sequences defined using a localization-based yeast selection system. Traffic 9, 2053–2062. doi: 10.1111/j.1600-0854.2008.00825.x
Kosugi, S., Yanagawa, H., Terauchi, R., and Tabata, S. (2014). NESmapper: accurate prediction of leucine-rich nuclear export signals using activity-based profiles. PLoS Comput. Biol. 10:e1003841. doi: 10.1371/journal.pcbi.1003841
Krahmer, N., Najafi, B., Schueder, F., Quagliarini, F., Steger, M., Seitz, S., et al. (2018). Organellar proteomics and phospho-proteomics reveal subcellular reorganization in diet-induced hepatic steatosis. Dev. Cell 47, 205–221.e7. doi: 10.1016/j.devcel.2018.09.017
Krogh, A., Sonnhammer, E. L. L., and Käll, L. (2007). Advantages of combined transmembrane topology and signal peptide prediction—the Phobius web server. Nucleic Acids Res. 35, W429–W432. doi: 10.1093/nar/gkm256
La Cour, T., Kiemer, L., Mølgaard, A., Gupta, R., Skriver, K., and Brunak, S. (2004). Analysis and prediction of leucine-rich nuclear export signals. Protein Eng. Des. Sel. 17, 527–536. doi: 10.1093/protein/gzh062
Lange, A., Mills, R. E., Lange, C. J., Stewart, M., Devine, S. E., and Corbett, A. H. (2007). Classical nuclear localization signals: definition, function, and interaction with importin α. J. Biol. Chem. 282, 5101–5105. doi: 10.1074/jbc.R600026200
Lee, B. J., Cansizoglu, A. E., Su, K. E., Louis, T. H., Zhang, Z., and Chook, Y. M. (2006). Rules for nuclear localization sequence recognition by karyopherin β2. Cell 126, 543–558. doi: 10.1016/j.cell.2006.05.049
Lee, D. W., Lee, S., Lee, J., Woo, S., Razzak, M. A., Vitale, A., et al. (2019). Molecular mechanism of the specificity of protein import into chloroplasts and mitochondria in plant cells. Mol. Plant 12, 951–966. doi: 10.1016/j.molp.2019.03.003
Lertampaiporn, S., Nuannimnoi, S., Vorapreeda, T., Chokesajjawatee, N., Visessanguan, W., and Thammarongtham, C. (2019). PSO-LocBact: a consensus method for optimizing multiple classifier results for predicting the subcellular localization of bacterial proteins. Biomed. Res. Int. 2019:5617153. doi: 10.1155/2019/5617153
Li, H. M., and Chiu, C. C. (2010). Protein transport into chloroplasts. Annu. Rev. Plant Biol. 61, 157–180. doi: 10.1146/annurev-arplant-042809-112222
Liku, M. E., Legere, E. A., and Moses, A. M. (2018). NoLogo: a new statistical model highlights the diversity and suggests new classes of Crm1-dependent nuclear export signals. BMC Bioinformatics 19:65. doi: 10.1186/s12859-018-2076-7
Lin, J., and Hu, J. (2013). SeqNLS: nuclear localization signal prediction based on frequent pattern mining and linear motif scoring. PLoS One 8:e76864. doi: 10.1371/journal.pone.0076864
Lisitsyna, O. M., Seplyarskiy, V. B., and Sheval, E. V. (2017). Comparative analysis of nuclear localization signal (NLS) prediction methods. Biopolym. Cell 33, 147–154. doi: 10.7124/bc.00094C
Lundberg, E., and Borner, G. H. H. (2019). Spatial proteomics: a powerful discovery tool for cell biology. Nat. Rev. Mol. Cell Biol. 20, 285–302. doi: 10.1038/s41580-018-0094-y
Maertens, G. N., Cook, N. J., Wang, W., Hare, S., Shree, S., and Öztop, I. (2014). Structural basis for nuclear import of splicing factors by human transportin 3. Proc. Natl. Acad. Sci. U.S.A. 111, 2728–2733. doi: 10.1073/pnas.1320755111
Martelli, P. L., Fariselli, P., and Casadio, R. (2003). An ENSEMBLE machine learning approach for the prediction of all-alpha membrane proteins. Bioinformatics 19, i205–i211. doi: 10.1093/bioinformatics/btg1027
Mathur, D., Singh, S., Mehta, A., Agrawal, P., and Raghava, G. P. S. (2018). In silico approaches for predicting the half-life of natural and modified peptides in blood. PLoS One 13:e0196829. doi: 10.1371/journal.pone.0196829
Mehdi, A. M., Sehgal, M. S. B., Kobe, B., Bailey, T. L., and Bodén, M. (2011). A probabilistic model of nuclear import of proteins. Bioinformatics 27, 1239–1246. doi: 10.1093/bioinformatics/btr121
Mossmann, D., Meisinger, C., and Vögtle, F. N. (2012). Processing of mitochondrial presequences. Biochim. Biophys. Acta Gene Regul. Mech. 1819, 1098–1106. doi: 10.1016/j.bbagrm.2011.11.007
Nakai, K. (2001). Review: prediction of in vivo fates of proteins in the era of genomics and proteomics. J. Struct. Biol. 134, 103–116. doi: 10.1006/jsbi.2001.4378
Nakai, K., and Kanehisa, M. (1991). Expert system for predicting protein localization sites in gram-negative bacteria. Proteins Struct. Funct. Bioinforma. 11, 95–110. doi: 10.1002/prot.340110203
Nakai, K., and Kanehisa, M. (1992). A knowledge base for predicting protein localization sites in eukaryotic cells. Genomics 14, 897–911. doi: 10.1016/S0888-7543(05)80111-9
Nielsen, H. (2017). Protein sorting prediction. Methods Mol. Biol. 1615, 23–57. doi: 10.1007/978-1-4939-7033-9_2
Nielsen, H., Tsirigos, K. D., Brunak, S., and von Heijne, G. (2019). A brief history of protein sorting prediction. Protein J. 38, 200–216. doi: 10.1007/s10930-019-09838-3
Nightingale, D. J. H., Oliver, S. G., and Lilley, K. S. (2019). Mapping the Saccharomyces cerevisiae spatial proteome with high resolution using hyperLOPIT. Methods Mol. Biol. 2049, 165–190. doi: 10.1007/978-1-4939-9736-7_10
Nilsson, I., Lara, P., Hessa, T., Johnson, A. E., von Heijne, G. V., and Karamyshev, A. L. (2015). The code for directing proteins for translocation across ER membrane: SRP cotranslationally recognizes specific features of a signal sequence. J. Mol. Biol. 427, 1191–1201. doi: 10.1016/j.jmb.2014.06.014
Orioli, T., and Vihinen, M. (2019). Benchmarking subcellular localization and variant tolerance predictors on membrane proteins. BMC Genomics 20:547. doi: 10.1186/s12864-019-5865-0
Orre, L. M., Vesterlund, M., Pan, Y., Arslan, T., Zhu, Y., Fernandez Woodbridge, A., et al. (2019). SubCellBarCode: proteome-wide mapping of protein localization and relocalization. Mol. Cell 73, 166–182.e7. doi: 10.1016/j.molcel.2018.11.035
Paila, Y. D., Richardson, L. G. L., and Schnell, D. J. (2015). New insights into the mechanism of chloroplast protein import and its integration with protein quality control, organelle biogenesis and development. J. Mol. Biol. 427, 1038–1060. doi: 10.1016/j.jmb.2014.08.016
Palmer, T., and Stansfeld, P. J. (2020). Targeting of proteins to the twin-arginine translocation pathway. Mol. Microbiol. 113, 861–871. doi: 10.1111/mmi.14461
Paramasivam, N., and Linke, D. (2011). Clubsub-P: cluster-based subcellular localization prediction for gram-negative bacteria and archaea. Front. Microbiol. 2:218. doi: 10.3389/fmicb.2011.00218
Peabody, M. A., Laird, M. R., Vlasschaert, C., Lo, R., and Brinkman, F. S. L. (2016). PSORTdb: expanding the bacteria and archaea protein subcellular localization database to better reflect diversity in cell envelope structures. Nucleic Acids Res. 44, D663–D668. doi: 10.1093/nar/gkv1271
Peabody, M. A., Lau, W. Y. V., Hoad, G. R., Jia, B., Maguire, F., Gray, K. L., et al. (2020). PSORTm: a bacterial and archaeal protein subcellular localization prediction tool for metagenomics data. Bioinformatics 36, 3043–3048. doi: 10.1093/bioinformatics/btaa136
Petersen, T. N., Brunak, S., von Heijne, G., and Nielsen, H. (2011). SignalP 4.0: discriminating signal peptides from transmembrane regions. Nat. Methods 8, 785–786. doi: 10.1038/nmeth.1701
Pfanner, N., Warscheid, B., and Wiedemann, N. (2019). Mitochondrial proteins: from biogenesis to functional networks. Nat. Rev. Mol. Cell Biol. 20, 267–284. doi: 10.1038/s41580-018-0092-0
Pierleoni, A., Martelli, P., and Casadio, R. (2008). PredGPI: a GPI-anchor predictor. BMC Bioinformatics 9:392. doi: 10.1186/1471-2105-9-392
Pierleoni, A., Martelli, P. L., and Casadio, R. (2011). MemLoci: predicting subcellular localization of membrane proteins in eukaryotes. Bioinformatics 27, 1224–1230. doi: 10.1093/bioinformatics/btr108
Pierleoni, A., Martelli, P. L., Fariselli, P., and Casadio, R. (2006). BaCelLo: a balanced subcellular localization predictor. Bioinformatics 22, e408–e416. doi: 10.1093/bioinformatics/btl222
Prieto, G., Fullaondo, A., and Rodriguez, J. A. (2014). Prediction of nuclear export signals using weighted regular expressions (Wregex). Bioinformatics 30, 1220–1227. doi: 10.1093/bioinformatics/btu016
Salvatore, M., Warholm, P., Shu, N., Basile, W., and Elofsson, A. (2017). SubCons: a new ensemble method for improved human subcellular localization predictions. Bioinformatics 33, 2464–2470. doi: 10.1093/bioinformatics/btx219
Savojardo, C., Fariselli, P., and Casadio, R. (2013). BETAWARE: a machine-learning tool to detect and predict transmembrane beta-barrel proteins in prokaryotes. Bioinformatics 29, 504–505. doi: 10.1093/bioinformatics/bts728
Savojardo, C., Martelli, P. L., Fariselli, P., and Casadio, R. (2015). TPpred3 detects and discriminates mitochondrial and chloroplastic targeting peptides in eukaryotic proteins. Bioinformatics 31, 3269–3275. doi: 10.1093/bioinformatics/btv367
Savojardo, C., Martelli, P. L., Fariselli, P., and Casadio, R. (2017). SChloro: directing Viridiplantae proteins to six chloroplastic sub-compartments. Bioinformatics 33, 347–353. doi: 10.1093/bioinformatics/btw656
Savojardo, C., Martelli, P. L., Fariselli, P., and Casadio, R. (2018a). DeepSig: deep learning improves signal peptide detection in proteins. Bioinformatics 34, 1690–1696. doi: 10.1093/bioinformatics/btx818
Savojardo, C., Martelli, P. L., Fariselli, P., Profiti, G., and Casadio, R. (2018b). BUSCA: an integrative web server to predict subcellular localization of proteins. Nucleic Acids Res. 46, W459–W466. doi: 10.1093/nar/gky320
Schneider, G., Sjöling, S., Wallin, E., Wrede, P., Glaser, E., and von Heijne, G. (1998). Feature-extraction from endopeptidase cleavage sites in mitochondrial targeting peptides. Proteins Struct. Funct. Genet. 30, 49–60. doi: 10.1002/(SICI)1097-0134(19980101)30:1<49::AID-PROT5>3.0.CO;2-F
Shen, Y., Ding, Y., Tang, J., Zou, Q., and Guo, F. (2020). Critical evaluation of web-based prediction tools for human protein subcellular localization. Brief. Bioinform. 21, 1628–1640. doi: 10.1093/bib/bbz106
Siegel, S. D., Reardon, M. E., and Ton-That, H. (2017). Anchoring of LPXTG-like proteins to the gram-positive cell wall envelope. Curr. Top. Microbiol. Immunol. 404, 159–175. doi: 10.1007/82_2016_8
Small, I., Peeters, N., Legeai, F., and Lurin, C. (2004). Predotar: a tool for rapidly screening proteomes for N-terminal targeting sequences. Proteomics 4, 1581–1590. doi: 10.1002/pmic.200300776
Stekhoven, D. J., Omasits, U., Quebatte, M., Dehio, C., and Ahrens, C. H. (2014). Proteome-wide identification of predominant subcellular protein localizations in a bacterial model organism. J. Proteomics 99, 123–137. doi: 10.1016/j.jprot.2014.01.015
Thul, P. J., Akesson, L., Wiking, M., Mahdessian, D., Geladaki, A., Ait Blal, H., et al. (2017). A subcellular map of the human proteome. Science 356:eaal3321. doi: 10.1126/science.aal3321
Uhlen, M., Oksvold, P., Fagerberg, L., Lundberg, E., Jonasson, K., Forsberg, M., et al. (2010). Towards a knowledge-based human protein atlas. Nat. Biotechnol. 28, 1248–1250. doi: 10.1038/nbt1210-1248
UniProt Consortium (2019). UniProt: a worldwide hub of protein knowledge. Nucleic Acids Res. 47, D506–D515. doi: 10.1093/nar/gky1049
Vakser, I. A. (2020). Challenges in protein docking. Curr. Opin. Struct. Biol. 64, 160–165. doi: 10.1016/j.sbi.2020.07.001
Vögtle, F. N., Wortelkamp, S., Zahedi, R. P., Becker, D., Leidhold, C., Gevaert, K., et al. (2009). Global analysis of the mitochondrial N-proteome identifies a processing peptidase critical for protein stability. Cell 139, 428–439. doi: 10.1016/j.cell.2009.07.045
von Heijne, G. (1986). Mitochondrial targeting sequences may form amphiphilic helices. EMBO J. 5, 1335–1342. doi: 10.1002/j.1460-2075.1986.tb04364.x
von Heijne, G. (1990). The signal peptide. J. Membr. Biol. 115, 195–201. doi: 10.1007/BF01868635
Wan, S., Mak, M. W., and Kung, S. Y. (2012). mGOASVM: multi-label protein subcellular localization based on gene ontology and support vector machines. BMC Bioinformatics 13:290. doi: 10.1186/1471-2105-13-290
Wang, X., Zhang, J., and Li, G. Z. (2015). Multi-location gram-positive and gram-negative bacterial protein subcellular localization using gene ontology and multi-label classifier ensemble. BMC Bioinformatics 16:S1. doi: 10.1186/1471-2105-16-S12-S1
Xu, D., Marquis, K., Pei, J., Fu, S. C., Çağatay, T., Grishin, N. V., et al. (2015). LocNES: a computational tool for locating classical NESs in CRM1 cargo proteins. Bioinformatics 31, 1357–1365. doi: 10.1093/bioinformatics/btu826
Yu, C. S., Chen, Y. C., Lu, C. H., and Hwang, J. K. (2006). Prediction of protein subcellular localization. Proteins Struct. Funct. Genet. 64, 643–651. doi: 10.1002/prot.21018
Yu, N. Y., Wagner, J. R., Laird, M. R., Melli, G., Rey, S., Lo, R., et al. (2010). PSORTb 3.0: improved protein subcellular localization prediction with refined localization subcategories and predictive capabilities for all prokaryotes. Bioinformatics 26, 1608–1615. doi: 10.1093/bioinformatics/btq249
Zhang, S., Xia, X., Shen, J., Zhou, Y., and Sun, Z. (2008). DBMLoc: a database of proteins with multiple subcellular localizations. BMC Bioinformatics 9:127. doi: 10.1186/1471-2105-9-127
Zybailov, B., Rutschow, H., Friso, G., Rudella, A., Emanuelsson, O., Sun, Q., et al. (2008). Sorting signals, N-terminal modifications and abundance of the chloroplast proteome. PLoS One 3:e1994. doi: 10.1371/journal.pone.0001994
Keywords: protein sorting/targeting, subcellular localization, sorting/targeting signals, prediction methods, bacteria, archaea, eukarya

Citation: Imai K and Nakai K (2020) Tools for the Recognition of Sorting Signals and the Prediction of Subcellular Localization of Proteins From Their Amino Acid Sequences. Front. Genet. 11:607812. doi: 10.3389/fgene.2020.607812

Received: 18 September 2020; Accepted: 03 November 2020; Published: 25 November 2020.

Edited by: Shuai Cheng Li, City University of Hong Kong, Hong Kong

Reviewed by: Litao Sun, Sun Yat-sen University, China; Marti Aldea, Instituto de Biología Molecular de Barcelona (IBMB), Spain

Copyright © 2020 Imai and Nakai. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Kenta Nakai, [email protected]


