Knowledge Hub Console

Knowledge Details

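A document record appears here only after import has completed. Upload sessions, source files, and parsed knowledge entries are deliberately separate concepts.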

Active · pdf_manual · knowledge-hub-cosmx-smi-v2-2

MAN-10162-11_CosMx_SMI_Data_Analysis_User_Manual_for_v2.2.pdf

Updated 2026-03-30T15:34:42.048Z · Uploaded by codex_seed

Content

[Page 1]
MANUAL
CosMx® SMI Data Analysis
MAN-10162-11
For AtoMx® SIP v2.2
FOR RESEARCH USE ONLY. Not for use in diagnostic procedures.
Innovation with Integrity

[Page 2]
Contacts

Bruker Spatial Biology, Inc.
3350 Monte Villa Parkway
Bothell, Washington, USA 98021
www.brukerspatialbiology.com
Tel: 888-358-6266
Fax: 206-378-6288
Technical Support: support.spatial@bruker.com
Customer Service: customerservice.spatial@bruker.com

Sales Contacts
North America: nasales.bsb@bruker.com
EMEA: emeasales.bsb@bruker.com
APAC: apacsales.bsb@bruker.com
All other regions: globalsales.bsb@bruker.com

Luxendo GmbH
Im Breitspiel 2-4
69126 Heidelberg
Germany

UK Rep
NanoString Technologies, Europe Limited
Suite 2 First Floor, 10 Temple Back
Bristol, United Kingdom BS1 6FL
bnano.legal@bruker.com

[Page 3]
Rights, License, and Trademarks

Use
For Research Use Only. Not for use in diagnostic procedures.

Intellectual Property Rights
This CosMx® Spatial Molecular Imager (SMI) User Manual and its contents are the property of Bruker Corporation ("Bruker"), and are intended solely for use by Bruker customers, for the purpose of operating the CosMx SMI System. The CosMx SMI System (including both its software and hardware components), this User Manual, and any other documentation provided to you by Bruker in connection therewith, are subject to patents, copyright, trade secret rights and other intellectual property rights owned by, or licensed to, Bruker. No part of the software or hardware may be reproduced, transmitted, transcribed, stored in a retrieval system, or translated into other languages without the prior written consent of Bruker. For a list of patents, see www.nanostring.com/company/patents.

Limited License
Subject to the terms and conditions of the CosMx SMI System contained in the product quotation, Bruker grants you a limited, non-exclusive, non-transferable, non-sublicensable, research use only license to use the proprietary CosMx SMI System only in accordance with the manual and other written instructions provided by Bruker. Except as expressly set forth in the terms and conditions, no right or license, whether express, implied or statutory, is granted by Bruker under any intellectual property right owned by, or licensed to, Bruker by virtue of the supply of the proprietary CosMx SMI System. Without limiting the foregoing, no right or license, whether express, implied or statutory, is granted by Bruker to use the CosMx SMI System with any third party product not supplied or licensed to you by Bruker or recommended for use by Bruker in a manual or other written instruction provided by Bruker.

Trademarks
Bruker Spatial Biology, the NanoString logo, CosMx and AtoMx are trademarks or registered trademarks of Bruker Corporation in the United States and/or other countries. All other trademarks and/or service marks not owned by Bruker Corporation that appear in this document are the property of their respective owners.

Open Source Software Licenses
Visit http://nanostring.com/cosmx-oss and http://nanostring.com/atomx-oss for a list of open source software licenses used in CosMx Spatial Molecular Imaging.

Copyright
© 2025 Bruker Corporation. All rights reserved.

[Page 4]
Table of Contents

CosMx SMI Data Analysis User Manual 1
Rights, License, and Trademarks 3
Table of Contents 4
Changes in this Revision 7
Changes in v2.2 Affecting Customer-Built Analysis Pipelines (after AtoMx Data Export) 7
Conventions & Safety 8
CosMx SMI Workflow Overview 9
CosMx SMI User Manuals and Resources 10
CosMx SMI Data is Managed in the AtoMx Spatial Informatics Platform 11
Cell Segmentation 14
Cell Segmentation Workflow 15
Cell Segmentation Configurations 20
Examples of Adjustments to Parameters 22
Create and Open a Study 26
Delete a Study 27
Orientation to CosMx SMI Data Analysis Suite 28
Manage Annotations 36
Run a Pipeline 37
How to Run Data Analysis Pipelines That Require Parameter Inputs 40
Custom Modules 41
Recommendations for Analyzing CosMx Whole Transcriptome (WTX) Data 43
CosMx SMI Pipeline Modules 46
Initial Data 46
Quality Control - RNA 47
Quality Control - Protein 50
Gene Selection - RNA 51
Normalization - RNA 51
Normalization - Protein 53
Principal Component Analysis (PCA) - RNA or Protein 54
UMAP - RNA or Protein 54
Cell Typing (InSituType) - RNA 56

[Page 5]
Expression Model (CELESTA) - Protein 59
Cell Typing (CELESTA) - Protein 59
Cell Type QC - RNA 60
Neighbor Network: Expression Space - RNA or Protein 61
Leiden Clustering - RNA or Protein 61
Identify Marker Genes - RNA or Protein 62
Neighborhood Analysis - RNA or Protein 63
Ligand-Receptor (LR) Analysis - RNA 63
Spatial Network - RNA or Protein 64
Cell Type Co-Localization - RNA or Protein 64
Pathway Analysis - RNA 65
Spatial Expression Analysis - RNA 65
Differential Expression (DE) - RNA 66
Novae - RNA (Spatial Discovery studies only) 68
Spatial Discovery - RNA (Spatial Discovery studies only) 69
CosMx SMI Data Visualizations 70
Study Statistics Table 70
QC Metrics Table 70
XY Plot 71
Heatmap 71
Box Plot and Violin Plot 72
Histogram 72
PCA Plot 73
UMAP Plot 73
Volcano Plot 74
Flightpath Plot 74
Save a Visualization 75
Export Images 76
Export Data 77
Working with Exported Data 81
Appendix I: Literature References 86
Appendix II: Create a Signature Matrix and Tuning Parameter File for Cell Typing (CELESTA) 87
Appendix III: Set up to Export Data to an AWS S3 Bucket 90
Appendix IV: Download CosMx SMI Files From S3 Bucket After Export 99

[Page 6]
Troubleshooting and Technical Support 102
Final page 104

[Page 7]
Changes in this Revision

This user manual has been updated to reflect new features of CosMx SMI Data Analysis version 2.2:
- Transfer flow cell ownership to other AtoMx SIP account holders, described on page 12
- Open a flow cell image in a new tab of the browser window, described on page 12
- Updated cell segmentation use cases for muscle (cross-sections, longitudinal sections, and heart) and examples for resegmentation, beginning on page 20
- Added information about an experimental new Spatial Discovery study configuration (page 26), visualization-first features and download package for LLM analysis (pages 28-29), and data analysis pipeline (page 37)
- Updated the Recommendations for Analyzing Whole Transcriptome (WTX) Data with more detail on calculating SNR and incorporating the Gene Selection module, on page 43
- Added information about new data analysis modules in the Spatial Discovery pipeline, Novae and Spatial Discovery, on page 68
- Corrected description of FOV Positions flat file (coordinates refer to top left of FOV rather than center) on page 78
- Added to Troubleshooting section on page 102
- Minor edits for clarification

Changes in v2.2 Affecting Customer-Built Analysis Pipelines (after AtoMx Data Export)

Please be aware of these changes in v2.2 that may require edits to local, customer-built analysis pipelines:
- Cell metadata columns will always contain "." instead of "-" regardless of the modules run prior to export.

[Page 8]
Conventions & Safety

The following conventions are used throughout this manual and are described for your reference.

Bold text is typically used to highlight a specific button, keystroke, or menu option. It may also be used to highlight important text or terms.

Blue underlined text is typically used to highlight links and/or references to other sections of the manual. It may also be used to highlight references to other manuals and/or instructional material.

The gray box indicates general information that may be useful for improving assay performance. The notes may clarify other instructions or provide guidance to improve the efficiency of the assay workflow.

WARNING: This symbol indicates the potential for bodily injury or damage to the instrument if the instructions are not followed correctly. Always carefully read and follow the instructions accompanied by this symbol to avoid potential hazards.

IMPORTANT: This symbol indicates important information that is critical to ensure a successful assay. Following these instructions may help improve the quality of your data.

Safety

WARNING: Read the Safety Data Sheets (SDS) and follow the handling instructions. Wear appropriate protective eyewear, clothing, and gloves. SDS are available from https://nanostring.com/resources/safety-data-sheets.

[Page 9]
CosMx SMI Workflow Overview

Figure 1: CosMx SMI workflow overview

Day 1: Slide Preparation. Prepare slides manually or using the BOND RX/RXm fully automated IHC/ISH stainer from Leica Biosystems (BOND RX/RXm).

Day 2: Process Slides on CosMx SMI. Complete assay and assemble the flow cells. Load assembled flow cells into the CosMx SMI instrument and enter flow cell/study information. Tissue is scanned to capture RNA or Protein readout and morphology imaging within user-designated fields of view (FOVs).

After run completion: Create a Data Analysis study in the AtoMx® Spatial Informatics Platform (SIP), perform quality-control checks and data analysis, and generate analysis plots.
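
For customer-built pipelines affected by the v2.2 column-naming change listed on page 7 (cell metadata columns contain "." instead of "-"), the adjustment can be sketched as a simple rename of the column names a local script expects. This is a minimal sketch only: the column names below are hypothetical placeholders, not names taken from an actual AtoMx export, and the helper functions are illustrative rather than part of any Bruker tooling.

```python
def to_v22_column_name(name: str) -> str:
    """Return the column name with every "-" replaced by ".",
    matching the v2.2 export behaviour described on page 7."""
    return name.replace("-", ".")


def remap_expected_columns(expected):
    """Map each column name a local pipeline expects to its v2.2 spelling."""
    return {name: to_v22_column_name(name) for name in expected}


# Hypothetical column names used only for illustration.
expected_columns = ["cell-type", "qc-flag", "fov"]
column_mapping = remap_expected_columns(expected_columns)

for old, new in column_mapping.items():
    print(f"{old} -> {new}")
```

Names without a hyphen pass through unchanged, so the mapping can be applied to a full column list without special-casing.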

[Page 10]
CosMx SMI User Manuals and Resources

The CosMx SMI workflow is divided into the following user manuals:

Step: Prepare Slides
  RNA: CosMx SMI Manual Slide Preparation for RNA Assays (MAN-10184); CosMx SMI Semi-Automated Slide Preparation for RNA Assays (MAN-10186)
  Protein: CosMx SMI Manual Slide Preparation for Protein Assays (MAN-10185); CosMx SMI Semi-Automated Slide Preparation for Protein Assays (MAN-10187)
  Multiomics: CosMx SMI Manual Slide Preparation for Multiomic Assays (MAN-10201)
Step: Process on CosMx SMI
  CosMx SMI Instrument User Manual (MAN-10161)
Step: Data Analysis
  CosMx SMI Data Analysis User Manual (MAN-10162)

User manuals and other documents can be found online in the NanoU Document Library. Instrument and workflow training courses are also available in NanoU.

For information about the AtoMx Spatial Informatics Platform, please refer to the AtoMx SIP Platform Administration Manual (MAN-10170).

Additional data analysis support and resources can be found in the CosMx Analysis Scratch Space.

[Page 11]
CosMx SMI Data is Managed in the AtoMx Spatial Informatics Platform

After primary processing on-instrument, data from the CosMx Spatial Molecular Imager (SMI) is automatically sent to the cloud-based AtoMx Spatial Informatics Platform (SIP) for further processing, storage, management, and collaboration. CosMx SMI data analysis is performed within AtoMx SIP in the CosMx SMI Data Analysis Suite (Figure 2).

Figure 2: The AtoMx SIP cloud platform houses the CosMx SMI Data Analysis Suite

This CosMx SMI Data Analysis User Manual outlines the steps to analyze data in the Data Analysis Suite using the pipeline orchestrator and analysis modules. Learn more about spatial data analysis through NanoU and CosMx Analysis Scratch Space. Data analysis services are also available through Bruker (see Spatial Data Analysis Service).

Connecting CosMx SMI to AtoMx SIP

For CosMx SMI instrument owners, the instrument is registered to AtoMx SIP during installation or user training.
Once registered, data transfer from the instrument to AtoMx SIP occurs automatically. Refer to the CosMx SMI Site Preparation Guide (MAN-10171) and CosMx SMI Instrument User Manual (MAN-10161) for additional information and to address situations that require manual transfer or intervention.

Accessing AtoMx SIP

Log in to AtoMx SIP at your organization's custom URL. Refer to the AtoMx SIP Platform Administration Manual (MAN-10170) for more information about account set-up.

[Page 12]
Orientation to AtoMx SIP

The AtoMx SIP Home page and key features are shown below (Figure 3). For more details about AtoMx SIP (outside of the Data Analysis Suite), please refer to the AtoMx SIP Platform Administration Manual (MAN-10170).

Figure 3: Home page with key components labeled

From the left menu, navigate within sections of the platform. Please note that only the Home page is visible to External Users (AtoMx SIP users outside an Organizational Account; see the AtoMx SIP Platform Administration Manual (MAN-10170) for more information).

The gallery is accessible through the Home page, and displays the data collections, studies, and CosMx SMI flow cells on individual cards. The most recently accessed objects are displayed. Click on View All to view all objects.

Collections are groups of data objects (flow cells and studies). Individual objects can be part of more than one collection. Deleting a collection does not delete the individual objects in the collection.

Locate specific objects of interest using the Sort By tool (top right), the Global Content Filter (checkboxes at left), typing terms into the search bar, or clicking the filter icon to open a filtering pane.

Click the caret on a card to expand details about the flow cell status (Table 1). Click the 3-dot menu on a card for options including View, Open in new tab, Add to collection, View comments, View details, and More options (Delete).

Select a study (using its checkbox) to activate the Share button. Select a flow cell to activate the Transfer button.
Only flow cells with status Ready for Study Creation can be transferred. More details about sharing studies and transferring flow cells are in the AtoMx SIP Platform Administration Manual (MAN-10170).

[Page 13]
Flow cell status: Definition
Created: A flow cell record has been created in AtoMx SIP and is awaiting the data upload process from the CosMx SMI instrument.
In progress / Data upload: Upload and ingestion of data in progress. It can take up to 48 hours after run completion for data to finish uploading to AtoMx.
In progress / Image processing: Image stitching in progress for the layers of the display image in the viewer. Includes the preview scan, morphology image, and protein images (if applicable).
In progress / Target decoding: Target decoding in progress.
Ready for study creation: Prior steps are complete; flow cell is ready for analysis.
In progress / Resegmentation: Segmentation in progress; the flow cell cannot be used for analysis until segmentation is complete.
Ready for study creation with error: Flow cell encountered an error during segmentation. Flow cell is ready for analysis with the previous segmentation results. Re-trying segmentation may resolve the issue.
Error: An error in data upload, image processing, or target decoding has occurred. Click on the 3 dots, then View details for more information about the error.

Table 1: Flow cell status definitions. Click the caret to see details of "in progress".

CosMx SMI Data Ingestion Best Practices
- For image processing errors that occur during a CosMx SMI run, please allow the run to complete and perform the post-run cleaning before contacting Support at support.spatial@bruker.com. This ensures the instrument has completed the workflow before any intervention. It will not affect data transfer or result in lost data.
- During the post-run cleaning on CosMx SMI, a message indicates "Data upload in progress". These are the cleaning logs being uploaded to AtoMx SIP.
- Please allow up to 48 hours after run completion for flow cell data to finish uploading to AtoMx SIP. If flow cell status checkmarks (Figure 3) are incomplete or an error persists after 48 hours, contact support.spatial@bruker.com.
- To prevent data loss, resolve any data ingestion issues before starting a new CosMx SMI run.

[Page 14]
Cell Segmentation

Accurate cell segmentation is the foundation of meaningful spatial biology data. An overview of the AtoMx SIP cell segmentation pipeline is shown in Figure 4. Each z-stack has 5 channel images with different morphology marker signals. The images are projected and processed into one single image for each FOV. Pre-processing improves the image quality. The morphology marker signals are processed by machine-learning (ML) augmented cell segmentation to define the soma (cell body) and nuclear labels. The machine-learning output is processed into final cell labels to reflect the cells and their compartments for application in downstream analyses.

Figure 4: Cell segmentation pipeline in AtoMx SIP.

The cell segmentation options in AtoMx SIP enable researchers to evaluate the segmentation initially performed on their flow cell image and (if desired) change the configuration settings to optimize segmentation. Researchers can also apply different segmentation configurations to different FOVs across a flow cell to accommodate tissue arrays or other scenarios requiring FOV customization.

Segmentation on AtoMx SIP (if desired) is performed after the decoded data is transferred from the CosMx SMI instrument to AtoMx SIP and before data analysis studies are created. Segmentation can be performed multiple times in an iterative process to refine and optimize the cell boundaries for the particular biological sample or study. Segmentation profiles are named and carry version numbers. Once a data analysis study is created using a particular set of segmentation parameters, images can be segmented again by exiting the study and following the steps below. The new segmentation data will not overwrite the existing study, but new studies created will reflect the latest segmentation data.
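
The versioning behaviour described above (named profiles with version numbers, existing studies unaffected by later resegmentation) can be pictured with a small toy model. This is an illustrative sketch only: the classes and function below are hypothetical and do not correspond to any AtoMx SIP API; they simply model a study recording the profile version that was current when the study was created.

```python
from dataclasses import dataclass, field


@dataclass
class SegmentationProfile:
    """Toy model of a named segmentation profile with numbered versions."""
    name: str
    versions: list = field(default_factory=list)  # index 0 holds version 1

    def apply(self, params: dict) -> int:
        """Store a new parameter set and return its 1-based version number."""
        self.versions.append(dict(params))
        return len(self.versions)


@dataclass(frozen=True)
class Study:
    """Toy model of a study that pins the profile version used at creation."""
    name: str
    profile_name: str
    profile_version: int


def create_study(name: str, profile: SegmentationProfile) -> Study:
    """Create a study that reflects the latest segmentation at this moment."""
    return Study(name, profile.name, len(profile.versions))


profile = SegmentationProfile("Tonsil Segmentation 4")
profile.apply({"configuration": "A"})
first_study = create_study("study-1", profile)

# Resegmenting later adds version 2 but leaves the first study untouched.
profile.apply({"configuration": "D"})
second_study = create_study("study-2", profile)

print(first_study.profile_version, second_study.profile_version)
```

The frozen `Study` dataclass is a deliberate choice here: it mirrors the statement that new segmentation data does not overwrite an existing study.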

[Page 15]
Cell Segmentation Workflow

1. From the AtoMx SIP gallery, click on a flow cell to open it in the Image Viewer. Only flow cells with the status Ready for study creation are eligible for segmentation.
2. From the FOVs tab, click the Flow Cell Profile Name to view current segmentation parameters and/or click Segment Cells to open the Cell Segmentation Options window (Figure 5).

Figure 5: Cell Segmentation Options (center) and Current Segmentation View Details (right)

3. Fill in a Flow Cell Profile Name (naming the set of parameters which will be defined in the following steps, e.g., Tonsil Segmentation 4). Do not use a comma in the flow cell profile name.
4. From the FOV(s) dropdown, select the FOV(s) for this segmentation. (Select one or a few FOVs to apply the segmentation initially. Subsequently, the profile can be applied to more FOVs on the flow cell.)
5. From the Configuration dropdown, select the Configuration to serve as the basis for the segmentation. See examples in Cell Segmentation Configurations on page 20.
6. (Optional) Adjust parameters in the Basic tab (Table 2). See Examples of Adjustments to Parameters on page 22.

[Page 16]
Basic Parameter: Description
Nucleus Diameter (µm): How the cell mask is adjusted to accommodate nuclei of different sizes. Increase the value from the default to include more area; decrease to reduce the nuclear area.
Cytoplasm Diameter (µm): Aids the segmentation algorithm in resizing the image before segmentation is run or re-run. Enter a value that represents the average cell diameter in this sample or region of sample, or iterate based on previous segmentation results (if cells appear over-split, increase cell diameter; if cells appear over-merged, decrease cell diameter).
Final Cell Label Min Size: Threshold for filtering cell structures based on minimum required size.
Final Dilation/Erosion (µm): Adjusts the size of the cell mask, e.g. to include or exclude more transcripts that are located at the edges (that is, on top of the cell membrane). Increase the value from the default to include more area; decrease to reduce the included area. Dilation will automatically stop at boundaries between cells.

Table 2: Basic parameters in cell segmentation.

7. In the Advanced tab, targets and channels can be customized (Figure 6). For example, a channel with misleading staining (blurry, dim, or high autofluorescence) can be turned off, or an a la carte marker's channel can be turned on. Configurations A, C, D, E, F, G must have one channel for nuclei selected. Configuration B must have the B and/or U channel set to nuclei; other channels of Configuration B cannot be customized at this time. Configuration F targets and channels cannot be customized at this time.

Other parameters in the Advanced tab can be adjusted (see Table 3); however, most samples do not require changes to Advanced parameter default values. The values are editable for the rare situations where adjustments to the basic parameters cannot accurately capture the sample's cell boundaries. If changes are needed, it is recommended to edit only Model, Probability Threshold, and/or Flow Threshold under Cell Options or Nuclei Options. Refer to the Examples of Adjustments to Parameters on page 22.

Figure 6: Advanced parameters enable custom selection of nuclei, morphology, or other targets by channel.

[Page 17]
Advanced Parameter: Description
Include Mask Processing: Check box to include the 'post-ML processing' step as depicted in Figure 4. The only situation in which you may want to exclude mask processing is if the staining is poor or the image is blurry, such that the software lacks good data for processing.
Foreground Threshold: A measure of the probability that a certain area belongs to a certain cell. Default 0, range -6 to 6. A more positive number increases the stringency and includes less area. A more negative number includes more area.
Nearest Neighbor Split: True/False. Whether to assign a weak protrusion to the nearest neighbor cell.
Min Diam Ratio: Default 0.25, meaning that final cell segments will not be smaller than 0.25 × (Diameter indicated in the selected segmentation configuration). See Configurations on page 20.
Max Diam Ratio: Default 4, meaning that final cell segments will not be larger than 4 × (Diameter indicated in the selected segmentation configuration). See Configurations on page 20.
Nucleus Label Model: Segmentation model to employ. Default is bsbNuc. For details on other models from Cellpose, see https://cellpose.readthedocs.io/en/latest/models.html
Nucleus Probability Threshold: Threshold to filter out pixels of small nucleus probability that correspond to non-nuclei. When staining is poor, lowering this value to the minimum of -6 can pick up more nuclei, but results may be noisier.
Nucleus Flow Threshold: Threshold used to filter out magnitude of flows that correspond to small debris/non-nuclei. When staining is poor, lowering this value can pick up more nuclei signal, but results may be noisier.
Cytoplasm Label Model: Segmentation model to employ. Default is bsbCyto or bsbNeuro3Ch. For details on other models from Cellpose, see https://cellpose.readthedocs.io/en/latest/models.html
Cytoplasm Probability Threshold: Threshold to filter out pixels of small cell probability that correspond to non-cells. When staining is poor, lowering this value to the minimum of -6 can pick up more cells, but results may be noisier.
Cytoplasm Flow Threshold: Threshold used to filter out magnitude of flows that correspond to small debris/non-cells. When staining is poor, lowering this value can pick up more cells, but results may be noisier.
Final Cell Label Max Size: When segmenting is complete, no cell segment (area) will be larger than this value.
Prefer Cytoplasm Over Nucleus: True/False. TRUE will favor the cytoplasm-based segmentation over the nuclei-based segmentation calls.
Mononuclear Cell: True/False. If one cytoplasm covers multiple nuclei, TRUE will split the cytoplasm to yield mononuclear cells whereas FALSE will leave the cytoplasm with multiple nuclei.

Table 3: Advanced parameters in cell segmentation

8. When the parameters are set, click Apply to apply the new segmentation profile to the selected FOVs.
9. A dialog indicates that the flow cell is being segmented. Other users viewing the flow cell tile in the gallery see the status change from Ready for study creation to Resegmenting, indicating that the flow cell is being processed and is not available for data analysis. Segmentation takes 2-3 minutes per FOV. Any errors will be reported in a dialog on the screen.

[Page 18]
10. When complete, the new cell segmentation is displayed in the Image Viewer, with the profile name and version listed in the FOVs panel (refer back to Figure 5) and the Image Viewer legend (Figure 7).

Figure 7: Segmentation details in Image Viewer legend

11. Next, you can:
- Further modify the current segmentation profile by adjusting parameters and clicking Apply. The new parameters will be defined as version 2, then version 3, etc. To view version details (such as creator, date created, configuration, FOVs, and parameters), click the Version link in the FOVs panel (see link All FOVs (16) in Figure 5).
- Apply the current segmentation profile to additional FOVs in the flow cell by selecting additional FOVs from the dropdown and clicking Apply.
- Segment other FOVs with different configurations, if desired, by selecting the new FOV from the dropdown and setting the configuration and parameters as desired. Figure 8 gives an example of adjacent FOVs with different segmentation configurations.

Figure 8: Different FOVs can use different segmentation configurations.

12. Once segmentation is complete, the flow cell is ready for data analysis. The segmentation profile(s) in effect for this flow cell are displayed on the Create Study window (Figure 9) and in the flow cells panel of the CosMx SMI Data Analysis Suite (Figure 10).

[Page 19]
Figure 9: Segmentation profiles for selected flow cells are shown at Create Study step

Figure 10: Segmentation profiles applicable to a study's FOVs are shown in the Flow cells panel of the SMI Data Analysis Suite

Once the flow cell has had a study created from it, it can still be segmented (following the steps above, starting from the gallery). Existing studies will not be overwritten, but new studies will use the new segmentation profile. When a flow cell is opened in the Image Viewer from within a data analysis study, the version of the segmentation profile that was used for this study is displayed in the FOVs panel and in the legend.

Cell segmentation profiles applied to a flow cell are saved in the flow cell metadata, which can be exported through the built-in export function or the custom export module. In the raw decoded data output, folder CellStatsDir contains a folder for each version of segmentation applied, and within those, SegmentationManifest_Parameters_[SetID]_[version].json contains the cell segmentation set information for every resegmented FOV.
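
After export, the segmentation manifests described above can be located programmatically. The following is a minimal sketch, assuming only the file name pattern SegmentationManifest_Parameters_[SetID]_[version].json given in this manual. It further assumes that the version part contains no underscore, which the manual does not state. The JSON schema is not described here, so the sketch loads each file without interpreting its fields; the function names are illustrative.

```python
import json
import re
from pathlib import Path

# Pattern from the manual: SegmentationManifest_Parameters_[SetID]_[version].json
# Assumption: the version is the text after the last underscore.
MANIFEST_RE = re.compile(
    r"^SegmentationManifest_Parameters_(?P<set_id>.+)_(?P<version>[^_]+)\.json$"
)


def parse_manifest_name(filename: str):
    """Return (set_id, version) parsed from a manifest file name, or None."""
    match = MANIFEST_RE.match(filename)
    if match is None:
        return None
    return match.group("set_id"), match.group("version")


def find_manifests(cell_stats_dir):
    """Collect every manifest under CellStatsDir, one entry per file found."""
    results = []
    pattern = "SegmentationManifest_Parameters_*.json"
    for path in sorted(Path(cell_stats_dir).rglob(pattern)):
        parsed = parse_manifest_name(path.name)
        if parsed is None:
            continue
        with path.open(encoding="utf-8") as handle:
            content = json.load(handle)  # schema not documented here; kept as-is
        results.append(
            {"path": path, "set_id": parsed[0], "version": parsed[1], "content": content}
        )
    return results
```

Calling `find_manifests("path/to/CellStatsDir")` on a local copy of the export would return one dictionary per manifest; the path shown is a placeholder.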

[Page 20]
Cell Segmentation Configurations

Configuration A: Default for non-neural human tissues
  Cell Dilation (µm): 2.16
  Cell Diameter (µm): 8.28
  Nucleus Diameter (µm): 7.2
  Example image: Human Breast Tissue

Configuration B: For mouse and human neural tissue with RNA assays
  Cell Dilation (µm): 0.72
  Cell Diameter (µm): 14.4
  Nucleus Diameter (µm): 10.8
  Example image: Human Neural Tissue (FFPE) (RNA assay)

Configuration C: For nuclei/cells slightly larger than regular human cells (e.g. cell pellet arrays and some cultured cells)
  Cell Dilation (µm): 4.32
  Cell Diameter (µm): 14.4
  Nucleus Diameter (µm): 10.8
  Example image: Cell Pellet Array

Configuration D: For tissue containing very large cell types (e.g. megakaryocytes or osteosarcoma)
  Cell Dilation (µm): 4.32
  Cell Diameter (µm): 36
  Nucleus Diameter (µm): 18
  Example image: Human Osteosarcoma

[Page 21]
Configuration E: For heart muscle or combinations of large and small cells, e.g. hepatocytes with smaller immune cells in liver
  Cell Dilation (µm): 4.32
  Cell Diameter (µm): 29.28
  Nucleus Diameter (µm): 5.2
  Example image: Human Heart Muscle

Configuration F: For mouse neural tissue with protein assays
  Cell Dilation (µm): 0
  Cell Diameter (µm): 16.56
  Nucleus Diameter (µm): 6.2
  Example image: Mouse Neural Tissue (Protein assay)

Configuration G: Developed with laminin marker. For muscle and multi-nucleated cells. Optimized for cross-sections. See Example 6 on page 24 for adjustments for longitudinal sections.
  Cell Dilation (µm): 2.16
  Cell Diameter (µm): 5.2
  Nuclear Diameter (µm): 30
  Example image: Muscle with Immune Cells

[Page 22]
Examples of Adjustments to Parameters

Example 1: Since DAPI signal is low and missing in some cells, segmentation relies heavily on the membrane signal. Some cells are larger, so the segmentation configuration is changed from Configuration A to D. The cell model is changed from the default CP to cyto2, which better fits the elongated shapes in the membrane areas and low DAPI signal.
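
To see what the switch from Configuration A to D in Example 1 implies for segment size limits, the default Min Diam Ratio (0.25) and Max Diam Ratio (4) from Table 3 can be applied to the configuration diameters on pages 20-21. This is a worked-arithmetic sketch under one stated assumption: Table 3 refers to the "Diameter indicated in the selected segmentation configuration" without saying which one, and the code below assumes it is the Cell Diameter. The function name is illustrative.

```python
# Cell Diameter (µm) per configuration, copied from pages 20-21 of this manual.
CONFIG_CELL_DIAMETER_UM = {
    "A": 8.28,
    "B": 14.4,
    "C": 14.4,
    "D": 36.0,
    "E": 29.28,
    "F": 16.56,
    "G": 5.2,
}


def size_bounds(cell_diameter_um, min_ratio=0.25, max_ratio=4.0):
    """Return (lower, upper) segment size limits as ratio * diameter.

    Assumption: the ratios apply to the configuration's Cell Diameter.
    Defaults are the Table 3 defaults for Min Diam Ratio and Max Diam Ratio.
    """
    return min_ratio * cell_diameter_um, max_ratio * cell_diameter_um


for name in ("A", "D"):
    lower, upper = size_bounds(CONFIG_CELL_DIAMETER_UM[name])
    print(f"Configuration {name}: {lower:.2f} to {upper:.2f} µm")
```

Under that assumption, moving from A to D raises both limits by the same factor as the diameter itself (36 / 8.28), which is consistent with choosing D when some cells are larger.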
Original: Configuration A. Adjusted: Configuration D + cyto2 model.

Example 2: Normal kidney tissue; some cells have little boundary information. By reducing the Cell Probability and Flow Thresholds, more cells are segmented that were missed in the default configuration (marked in red on left).

Original: Configuration A. Adjusted: Cell Probability Threshold = -6; Flow Threshold = 0.

[Page 23]
Example 3: Liver tumor tissue; original Configuration E contains dilation of 4.32 µm. Reduce by half to 2.16 µm to tighten the area around each cell.

Original: Configuration E. Adjusted: Reduce dilation.

Example 4: Out-of-focus membrane signal that ought to be included in the segmentation; increase the dilation parameter from the default.

Original: Configuration A. Adjusted: Increase dilation.

[Page 24]
Example 5: When staining is poor and/or cells are not in focus, the Cyto2 model may perform better than the default CP model. In this example, the cell probability and flow threshold are also reduced to improve segmentation.

Original: CP model. Adjusted: Cyto2 model.

Example 6: Segmentation of different types of muscle cell. For muscle cross-sections, Configuration G is recommended, and segmentation may be improved upon by designating PanCK and CD298/Membrane markers as membrane markers during resegmentation in AtoMx SIP. In some cases, autofluorescence in those channels may add useful segmentation information.

For samples with longitudinal muscle sections, or a combination of longitudinal and cross-sections, it is recommended to select Configuration G on-instrument, then resegment in AtoMx SIP with the Configuration G base parameter, adjusting the nucleus diameter to be twice the default value, and selecting the Cyto2 nuclei model.

Original: Configuration G. Adjusted: Nuclei diameter = 60 µm, Cyto2 model.

[Page 25]
Example 7: For heart muscle, select Configuration G or E on-instrument, then resegment in AtoMx SIP, excluding PanCK and CD45 in the segmentation advanced parameters (see page 16). (Note that both Configuration G and E perform well if laminin is used as a marker with heart muscle. If laminin is not used, Configuration E tends to perform better.)

Original: Configuration E. Adjusted: Exclude PanCK and CD45 channels.

[Page 26]
Create and Open a Study

Studies can be created from data that has completed transfer (upload) from the instrument. Only flow cells run with the same panel can be combined in one study.

1. From the AtoMx SIP gallery, click the checkbox on the card(s) of the flow cells of interest. If the object is marked In progress, it is not yet available for study creation. The Create Study button in the top-right becomes active (Figure 11).

Figure 11: Select a flow cell to activate the Create Study button

2. Click Create Study.
3. In the Create Study window, enter the Study Name, Description, and Tags to annotate the study. For RNA studies, starting in AtoMx SIP v2.2, select whether to run the classic configuration or Spatial Discovery configuration. (Spatial Discovery is an experimental visual-first interface that automatically runs a Spatial Discovery analysis pipeline, described on page 37, and presents interactive visualizations upon opening the study.) Click Create.

During study creation, flow cell data are aggregated and organized into a data object. This process takes longer for studies with more flow cells and FOVs. When a Spatial Discovery study is created, a default pipeline is also run, so this process takes longer than classic study creation.

To open this or any existing study, navigate to your studies in the Home gallery and click on the study to open the CosMx SMI Data Analysis Suite.

A study creation log records the software's steps in creating the study. If needed for troubleshooting, download the study creation log from the Pipeline Structure panel (prior to any pipeline run) or the Study Details Panel (Figure 12) of the Data Analysis Suite.

Figure 12: Download study creation logs

[Page 27]
Delete a Study

To delete a study:
1. Click the 3-dots menu on the study's card in the gallery and select More Options, then Delete (Figure 13).
2. A confirmation window opens to confirm deleting the study from the gallery. If confirmed, the study is moved to Deleted Objects, accessible from the menu at left.
3. To review deleted objects, click Deleted Objects (Figure 14). Objects may be restored to the gallery or permanently deleted, depending on the permission level of the user, by checking the object's checkbox and selecting a button in the top right.

Deleting a collection does not delete the individual items in the collection.

Figure 13: Delete a study

Figure 14: Deleted Objects folder and Delete/Restore buttons

[Page 28]
Orientation to CosMx SMI Data Analysis Suite

CosMx SMI studies created with Spatial Discovery configuration open in the Spatial Discovery view (Figure 15). Studies created with classic configuration open in the classic view (Figure 16). See study configurations on page 26.

Figure 15: Spatial Discovery view in the CosMx SMI Data Analysis Suite.

Figure 16: Classic study view in the CosMx SMI Data Analysis Suite.

Different panels, described in the following pages, can be displayed or hidden to provide a customized workspace. Choose from the available preset views, or customize by clicking Add panels. You may need to scroll to the right to view all displayed panels. Minimize or maximize a panel using controls in its top right. Click and drag panels to arrange the view to your liking, then save your customized view by clicking Save view.
[Page 29]

Spatial Discovery View: Data Overlay

Upon opening a study in Spatial Discovery view, cells are colored by niche based on the results of the foundational pipeline that has run in the background on study creation (specifically, the results of the Novae or Neighborhood Analysis module; see page 68). The default selections and coloring options can be changed using the controls in the Data Overlay panel (Figure 17). Instead of coloring by niche, color cells by cell types, expression of a target, morphology markers, pathways, or total count. Under the Feature dropdown, select the features to color (such as cell type, target, marker, or niche, depending on the coloring scheme). The Levels dropdown sets the number of Novae niches to display.

Toggle on transcripts display by clicking Transcripts at the top of the Image Viewer (refer back to Figure 15). Once transcripts are displayed, their view settings can be edited.

Use the Cells to display filter to select a subset of cells to display in the image. For example, to show only cell types identified as Cluster 2 in the Leiden Clustering module (Figure 17), toggle to Filtered; select Variable: Cell types; select Source: Leiden Clustering; and select Feature: 2. Only those cells are displayed.

Figure 17: Data Overlay panel in Spatial Discovery view

Click Spatial Discovery at the bottom of the Data Overlay panel to launch Spatial Discovery Study Insights (Figure 18). This downloadable package includes a curated set of outputs from the current study, designed for use with your preferred Large Language Model (LLM). A ReadMe file is included, with details about the file contents, and users are also directed to the Guide to Using the Spatial Discovery Output with LLMs hosted in ScratchSpace.
Figure 18: Launch Spatial Discovery Study Insights

Spatial Discovery View: Study Data

The Study Data panel displays the Study Details and the Flow Cell and Pipeline Run panels. These panels are also a part of the classic study view and are described on the following pages.

[Page 30]

Study Details Panel

Includes information about the study, including the number of Fields of View (FOV) analyzed, the number of cells and genes detected, and the version of AtoMx in which the study was created (Figure 19). Click Details and logs to access more study information. Click Export to launch the Export Dataset dialog (see Export Data on page 77) or Show Export Details to review a completed export job.

Flow Cells Panel

Indicates which FOV is under analysis and the applicable cell segmentation profile (Figure 20). Click the pencil icon to open the Manage Annotations window (see Manage Annotations on page 36). Click the Search (magnifying glass) icon to search for a particular FOV by name. For a particular flow cell, click the arrow (caret) to view a list of the FOVs in the flow cell. Uncheck the box next to an FOV to de-select it from the Image Viewer panel.

Pipeline Run Panel

Displays executed pipelines (Figure 21). Click the arrow (caret) to display the modules comprising the pipelines. Click the pipeline icon to run a new pipeline (see Run a Pipeline on page 37). Click the custom modules icon to open Custom Modules (see Custom Modules on page 41). Click the trash icon to delete the pipeline from the Pipeline Run panel. Pipeline run names must begin with a letter, not a number.

Figure 19: Study Details panel
Figure 20: Flow Cells panel
Figure 21: Pipeline Run panel
[Page 31]

Pipeline Structure Panel

Presents a diagram of the data analysis pipeline selected in the Pipeline Run panel. Individual pipeline modules are depicted as blocks (Figure 22) with icons representing the following tools:

• Information about the module
• Show resource metrics (execution time, peak memory, average memory, peak CPU, average CPU) with option to download metrics and logs
• Mark for interactive run (for custom modules with live interaction; opens live terminal while module runs) or to disable interactive mode
• Rerun step
• Show settings (description, input parameters, output visualizations)
• Download files associated with module output (if available)

Numbers on the module blocks indicate the order of operations in the pipeline run. When a workflow is running, its status is depicted at the bottom of the Pipeline Structure panel. Completed modules are displayed with a check mark, while modules in progress are striped and modules pending an upstream step are marked Pending. Modules that failed are red. For descriptions of each module, please see CosMx SMI Pipeline Modules on page 46.

Pipeline Data Panel

Displays data visualizations for the module selected in the Pipeline Structure panel. Available visualization types vary depending on the module run. For example, the Normalization module enables Study Statistics table, XY plot, Heatmap, Box plot, Violin plot, and Histogram (Figure 23) while the PCA module enables Study Statistics table and PCA plot. For descriptions of all visualization types and their customizable settings, please see CosMx SMI Data Visualizations on page 70.

Figure 22: Pipeline Structure panel
Figure 23: Pipeline Data panel - honeycomb plot example
[Page 32]

Image Viewer Panel

Displays the tissue image and cell segmentation (Figure 24), with the option to overlay data. FOVs are bordered by white boxes.

Toggle the minimap and channel legend display. Modify the display colors by hovering on the channel name, then clicking the pen icon that appears. Pan across the image by clicking and dragging in the minimap or main (larger) image. Use the zoom control buttons (+ –) or mouse scroll wheel to zoom. Expand to full screen by clicking the icon in the top-right of the Image Viewer panel.

Figure 24: Image Viewer panel displays tissue and spatial data.

Select the flow cell to display from the dropdown menu. View the entire image or select specific FOV. Select data to overlay (cells and/or transcripts). Up to 25 genes' transcripts can be displayed at once. Select from additional display options: for Cells overlay, color by Cell Types, Expression, Morphology Markers, Pathways, or Total Count; select which pipeline step's data to display, and which cell type(s) or gene(s) to display. For Transcripts data, select which target(s) to display (up to 25). See examples of these data overlay options in Figure 24 through Figure 29. Click the color palette to adjust the display colors or assign custom colors.

Click the menu icon in top left to open the Image Viewer menu, with tabs for flow cell information, FOVs (with the option to hide outlines or labels), image layers (with the option to hide layers), render settings (with the option to hide channels, adjust intensity color scaling, and modify on-screen color mapping), and export images (see also Export Data on page 77). With software v2.1, the Image Layers tab has been updated to include opacity and visibility controls for the cell segmentation layer.
[Page 33]

Recommended Data Overlays and Interactivity using the Image Viewer Panel

Figure 25: Overlay cell transcripts
Figure 26: Overlay cell types (Leiden clustering module output)
Figure 27: Overlay cell types (Cell Typing module output)
Figure 28: Overlay cell niches (Neighborhood Analysis module output)

[Page 34]

Figure 29: (Top) Overlay cell types in Image Viewer and display UMAP in adjacent Pipeline Data panel. Toggle 'Enable Selection' on in Pipeline Data panel and use lasso tool to select cells of interest on UMAP, to visualize those cells in the tissue (bottom).

Figure 30 shows the orientation of the tissue in the Image Viewer panel.

Figure 30: Tissue orientation from the glass slide to the Image Viewer

[Page 35]

Annotation Editor Panel

Figure 31: Annotation Editor panel

View selected flow cells and FOVs (Figure 31). Click the icon to add flow cell or FOV attributes, which appear as new columns in the table. Click on a column's 3-dots menu to rename, delete, or sort by the column. Export the displayed FOV information and attributes by clicking the download icon. See more information in Manage Annotations on page 36.

Data Viewer Panel

Displays one or two data visualizations in separate panes (Figure 32). Click the icons in the panel header to toggle between a single or paired display. For each visualization, select the pipeline step to display from the dropdown menu. Customize the display by selecting from the available dropdown menus. Click the arrow (caret) to open additional customizable settings.

Figure 32: Data Viewer panel
[Page 36]

Manage Annotations

The Annotation Editor described here annotates data at the level of FOVs or flow cells. These annotations carry forward into the study metadata in the CosMx SMI Data Analysis Suite for use with the Cell Typing (InSituType) and Differential Expression modules. They can also be downloaded and used outside of the AtoMx platform, e.g. as inputs to external data analysis pipelines.

To annotate data at the cell level, use the Sample Metadata custom modules (see Custom Modules on page 41).

To open Manage Annotations, select Manage Annotations from the dropdown in Spatial Discovery view (refer back to Figure 15) or click the pencil icon from the Flow Cells panel (refer back to Figure 20). The Image Viewer and Annotation Editor open (Figure 33).

Figure 33: Manage annotations window

Review the features of the Image Viewer in the section Image Viewer Panel on page 32.

In the Annotation Editor, view available flow cells and FOVs. Click Show Selected FOVs to list only the FOVs highlighted in the Image Viewer. (Use Ctrl + Click to highlight multiple FOVs of interest or toggle the Highlight visible FOVs button to automatically highlight all FOVs showing in the Image Viewer. See the yellow highlighted FOVs in Figure 33.) Click the + icon to add flow cell or FOV attributes, which appear as additional columns in the table (refer back to Figure 31). Click on a column's 3-dots menu to rename, delete, or sort by the column. Once saved, column names cannot be changed.

Annotations cannot be added while a pipeline is running. Avoid using special characters and spaces. Do not use the Update Sample Metadata custom module to add FOV-level annotations if using the Annotation Editor in the same study.

Export the displayed FOV information and attributes by clicking the download icon.

Click Save Changes when done editing annotations to return to the Data Analysis Suite. To exit without saving, click the x in the top right.

Annotations are applied only within the present study; they do not carry over to new studies.
[Page 37]

Run a Pipeline

For Spatial Discovery studies, the Spatial Discovery pipeline (Figure 34) runs automatically upon study creation. This pipeline consists of the modules Initial Data, Quality Control, Normalization, Pathway Analysis, Gene Selection, PCA, UMAP, Neighbor Network: Expression Space, Leiden Clustering, Neighborhood Analysis, Novae, Identify Marker Genes, Spatial Discovery, and Differential Expression. (The pipeline pauses at the Differential Expression module to allow the parameters to be set by the user.) Read more about the modules in CosMx SMI Pipeline Modules on page 46. The modules and their order cannot be customized.

Figure 34: Spatial Discovery data analysis pipeline.

For classic configuration studies, to create and customize a pipeline, click the Pipeline icon in the Pipeline Run panel (refer back to Figure 16) to open the Run Pipeline window (Figure 35). Enter a run name that begins with a letter, not a number (please note this new requirement in AtoMx v2.0 and later).

Figure 35: Run Pipeline window, displaying the foundational pipeline for RNA data.

[Page 38]

Select or Edit a Defined Pipeline

The dropdown menu displays previously built and saved data analysis pipelines, including the foundational pipelines for RNA and Protein. These pipelines consist of the modules Quality Control (QC), Normalization, PCA, UMAP, Cell Typing, Neighbor Network: Expression Space, Leiden Clustering, Neighborhood Analysis, Identify Marker Genes, and (for Protein) Cell Type Co-localization (Figure 35). Read more about the modules in CosMx SMI Pipeline Modules on page 46.

After selecting a pipeline from the dropdown menu, click the pencil icon to edit it or the trash icon to delete it (not available for the foundational pipelines).
The foundational pipeline is a good place to start if you are new to spatial data analysis or are just beginning to explore your dataset. It may not suit all datasets and experimental design strategies. Neither the software nor this user manual intends to prescribe the "right" way to analyze your data, and analysis should ultimately be customized to the experimental design of your studies.

If you edit an existing pipeline and change the name, the software overwrites the original pipeline with the edits and new name. It does not "Save As" a copy of the pipeline.

[Page 39]

Create a New Pipeline

Click Create Pipeline to build a new data analysis pipeline. In the Create New Pipeline window (Figure 36), enter a pipeline name. Build your customized pipeline by dragging modules from the Toolbox into the gridded workspace.

Figure 36: Create New Pipeline window

Click and drag the grid to pan around the workspace, and use the mouse scroll wheel to zoom. Hover over or click on a module in the workspace to summon arrows to create connectors between modules (Figure 37); click the arrow, then the target module to draw a connecting arrow. If the module dependency prerequisites do not allow you to connect two modules, a notification appears in the top right. Review the module dependency prerequisites listed for each module in CosMx SMI Pipeline Modules on page 46. Number icons indicate the order of operations.

Hover over the info icon ⓘ to show information about the module, and click the gear icon to show its settings. Detailed descriptions of each module can be found in CosMx SMI Pipeline Modules on page 46.

To remove a module or connector from the pipeline, click on it in the workspace, then click its red x.

Figure 37: Arrows create links between modules

Click Save to save the pipeline (enabling you to close the window or click Back without losing your work). Once the pipeline is constructed, click Save & Create Run.
Enter a name for the pipeline run. Back in the Pipeline Structure panel, click Run All to begin the run. Follow the progress of the pipeline run as described on page 31.

[Page 40]

A single step may be re-run, if desired (for example, when testing different parameters or after a pipeline error). An option will be presented to Rerun step or Rerun step with children (to also run the subsequent steps). In either case, a new branch of the pipeline will be created. If the box for Start pipeline branch execution is checked, the step will begin running immediately.

Please note, a re-run custom module cannot be run in interactive mode (see Pipeline Structure Panel on page 31).

Figure 38: Retry pipeline step dialog

How to Run Data Analysis Pipelines That Require Parameter Inputs

The data analysis modules Cell Typing (CELESTA), Cell Type QC, Ligand-Receptor Analysis, and Differential Expression require parameter inputs before running, causing the data analysis pipeline to pause until this information is provided. If you need to see the output from upstream modules to set these parameters, there are two ways to accomplish this. The first method is to set up the pipeline with the modules numbered such that modules requiring input are last. Manually run (via the "play" button) each upstream module until satisfied with its output. Then, run the downstream modules.

The second method is to run short, preliminary pipelines to help determine the necessary input parameters for the modules. Once identified, you can construct the final data analysis pipeline with the appropriate parameters already established.
For instance, the Cell Type QC module allows you to rename, merge, or delete cell type clusters generated by the Cell Typing (InSituType) module. This module requires user input, such as new names, or cluster numbers to merge or delete, which can only be determined after the Cell Typing (InSituType) module completes and its output is evaluated. Therefore, it is advisable to run the upstream modules (or a separate test pipeline): Initial Data, QC, Normalization, and Cell Typing (InSituType). Evaluate the results to determine the appropriate input for the Cell Type QC module: which clusters, if any, ought to be renamed, merged, deleted, or subclustered. With this information, you can confidently run the downstream modules (or build the final data analysis pipeline) with the appropriate input parameters.

While not requiring parameter inputs to run, the Normalization, UMAP, and Cell Typing (InSituType) modules have parameters that should be reviewed before running to ensure that they fit the experimental and analytical design. Refer to the article "Tips when performing CosMx SMI data analysis with AtoMx SIP".

[Page 41]

Custom Modules

Custom modules are optional tools available for advanced users with some computational/coding background. The Spatial Discovery pipeline cannot be modified with custom modules.

For the latest custom modules, please check the CosMx Data Analysis Modules page on GitHub. Of note, the Sample Metadata custom modules enable data annotation at the cell level, which can then be used to subset data in the Cell Typing (InSituType) or Differential Expression modules of an analysis pipeline, or used in an external data analysis pipeline. (In AtoMx v2.0 and later, annotation at the flow cell or FOV level can be achieved following Manage Annotations on page 36.)

From the Pipeline Run panel (refer back to Figure 16), click the custom modules icon. The Custom Modules window opens (Figure 39). Edit or delete existing custom modules with the pencil and trash icons, respectively.

Figure 39: Custom Modules window with two custom modules loaded

To add a new custom module to the pipeline, click Add Module. The Add New Module window opens (Figure 40).
Enter the name (required; note that the characters / \ & % are not allowed), description, and parameters for running. Upload the script file(s) where indicated, and set the entry point (the main executable file).

Designate the R version and RAM, Max runtime, and CPU core specifications for the custom module. Using more than 16 cores will cause CPU throttling and may lead to diminishing returns.

In the Arguments section, click + to add the arguments or variables defined in the script that you want to manipulate in the custom module. Set their type (Numerical, String, Boolean, Private, or File), name (exactly match the variable in the script), display name (how it should appear in the user interface), the range of allowed values for this argument, the default value, and whether or not the argument should be required in the module.

[Page 42]

Figure 40: Add new module window

Private arguments are those containing sensitive information, such as AWS credentials. Private arguments will not appear in the user interface or log files.

For a file argument type, acceptable file formats are .csv, .txt, .RData, .RDS, and .rda. The file must be local (not in S3).

In the Packages section, click + to input package information including repository (CRAN, Bioconductor, or R-Forge), package, and version.

For an example of a custom module's input parameters, see Export Data on page 77.

Click Save to exit and add the defined module to the Custom Modules list, or exit without saving by clicking the x in the top right.

NOTE: If receiving an error such as "invalid multibyte character at line 3", check the R script for formatted quotation marks, which are not accepted by the SMI Data Analysis software. Quotation marks may be "" but not “ ”.

If a custom module running in interactive mode is stuck in pending, de-select interactive mode and run the pipeline again. See interactive mode description on page 31.
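The naming, file-format, and quotation-mark rules above can be checked locally before uploading a custom module. The following Python sketch is illustrative only and is not part of AtoMx SIP: the function names are hypothetical, the case-insensitive extension match is an assumption, and only the double typographic quotation marks named in the NOTE are searched for.

```python
import os

# Characters the manual says are not allowed in a custom module name.
FORBIDDEN_NAME_CHARS = set("/\\&%")
# File formats the manual lists as acceptable for a File-type argument.
ALLOWED_FILE_EXTENSIONS = {".csv", ".txt", ".rdata", ".rds", ".rda"}
# Formatted (typographic) double quotation marks named in the NOTE.
FORMATTED_QUOTES = "\u201c\u201d"


def invalid_name_chars(name):
    """Return the forbidden characters found in a module name, sorted."""
    return sorted(set(name) & FORBIDDEN_NAME_CHARS)


def is_allowed_argument_file(filename):
    """Check a File-type argument's extension (case-insensitive: an assumption)."""
    return os.path.splitext(filename)[1].lower() in ALLOWED_FILE_EXTENSIONS


def find_formatted_quotes(script_text):
    """Return 1-based (line, column) positions of formatted quotation marks."""
    hits = []
    for line_no, line in enumerate(script_text.splitlines(), start=1):
        for col, char in enumerate(line, start=1):
            if char in FORMATTED_QUOTES:
                hits.append((line_no, col))
    return hits
```

For example, `find_formatted_quotes(open("module.R", encoding="utf-8").read())` (with `module.R` standing in for your own script) lists the positions to retype with plain `"` characters.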
[Page 43]

Recommendations for Analyzing CosMx Whole Transcriptome (WTX) Data

A common analysis path for WTX data is outlined below. Remember that analysis in the CosMx SMI Data Analysis Suite should be customized to the experimental design of your studies. Neither the software nor this user manual intends to prescribe the "right" way to analyze your data. Additional resources for analyzing WTX data are online in the CosMx Analysis Scratch Space.

1. Evaluate the tissue image for staining, segmentation, and tissue integrity. If needed, consider resegmenting to find a better fit of cell boundaries for your particular tissue/sample (see Cell Segmentation on page 14). Blurry areas of the image may be indicative of tissue lifting away from the slide. Consider excluding these FOVs from downstream analyses.
2. Quality Control
• After running the QC module in AtoMx, use the RNA QC Plot custom module to calculate the Signal-to-Noise Ratio (SNR). (From the module output, open the file Per_FOV_data_quality_metrics and locate the column Mean_transcripts_per_cell_per_FOV. Divide each value by the number of genes in the panel (18,934 for WTX) to give the mean transcripts per plex per cell per FOV. Divide this value by the mean_negative_probe_per_plex_per_cell_per_FOV column, to give the SNR at FOV-level. That is, SNR = (Mean_transcripts_per_cell_per_FOV / 18,934) / mean_negative_probe_per_plex_per_cell_per_FOV.) Evaluating all FOVs together, look for SNR > ~2. Keep in mind that SNR depends on many factors including sample quality (sample ischemic time, fixation method, block age, section preparation, section age, sample processing, digestion conditions, etc.).
• At the cell level, look for transcripts per cell > ~200. If too many cells are flagged (30% or more), consider reducing this threshold. Transcripts per cell depends on many factors in study design and sample biology.
3. Normalization
• Total Counts Normalization is generally recommended as it keeps the data on a linear scale, is easily interpretable, and is quick to run.
• Other transformations (log1p, Pearson, sctransform) are possible and may improve visualizations in some datasets. However, the Pearson method is very resource- and time-intensive, so it is only recommended for smaller datasets (using lower plex panels than WTX).
4.
Gene Selection
• Selection of 3000 highly variable genes (HVGs) is recommended for downstream processing. Refer to the module description for Gene Selection - RNA on page 51.
5. PCA
• Calculate 50 principal components from normalized counts for selected HVGs.
6. UMAP
• Recommended parameters (optimal parameters are project-dependent; these are suggested as a starting point):

[Page 44]

◦ Minimum distance = 0.01; lower minimum distance generates more clusters.
◦ Spread = 5 or 2; higher spread yields more separation of clusters.
◦ Neighbors = 30; keep between 20-40; higher value yields more distinct clusters.
◦ Metric = cosine.
◦ Use between 15-50 principal components.
◦ Suggested resource: https://pair-code.github.io/understanding-umap/.
7. Cell Typing (Leiden Clustering or InSituType)
• Leiden Clustering gives an idea of the overall structure of the dataset. Adjust the "Resolution" parameter if desired (higher numerical value generates more, smaller clusters).
• For Cell Typing (InSituType), fully- or semi-supervised cell typing is recommended. This method is preferable to Leiden Clustering if you have a good reference profile, ideally derived from CosMx SMI data (see the CosMx SMI-based profiles at https://github.com/Nanostring-Biostats/CosMx-Cell-Profiles or scRNA-seq-based profiles at https://github.com/Nanostring-Biostats/CellProfileLibrary. If using scRNA-seq-based profiles, correct platform effects by selecting Rescale=TRUE in the InSituType module).
• A subset of the total dataset can be designated for input into the Cell Typing (InSituType) module (see Cell Typing (InSituType) Input parameters: on page 56). This can help data analysis pipelines by making the working dataset a bit smaller. Cells that are excluded from Cell Typing analysis will have the label "NA" in the result.
8.
Neighborhood (Niche) Analysis
• Neighborhood Analysis operates on the cell-type level, not the transcript level (i.e., it is based on cell-typing calls, not gene expression). A good starting point is often 7 niches with 50-micron radius. This analysis is typically quick to run, so it is recommended to iterate from the starting point to find the parameters that suit your experimental design.

If running Differential Expression (DE) analysis in large WTX studies, consider asking cell type-specific DE questions, subsetting, and running separate DE analyses by cell type, as they are often the most interesting and direct way to interpret DE results concerning individual genes. To subset data for input into the DE module, see the Differential Expression Input Parameters on page 67.

For DE questions that require using a full dataset, time to fit regression models depends on several factors, including the number of cells, complexity of model formula and number of included covariates, and distribution family (normal distribution (gaussian) / negative binomial (nbinom2)). If running DE with millions of cells at once is necessary, there could be considerable time spent on model fitting per gene. For context, a small benchmarking of running 10 million cells at once, for 1 thousand genes, with a simple model formula using a negative binomial mixed model designed to identify genes DE among spatial niches across all cells (below) took 11 hours for full module completion, or around 0.66 minutes per gene (11 hours × 60 minutes / 1,000 genes):

~RankNorm(otherct_expr) + (1|Run_Tissue_name) + spatialClusteringAssignments + offset(log(nCount_RNA)),

For comparison, the same 10 million cells run using a gaussian / normal distribution with normalized data, and replacing the random effect with a fixed effect (below) took slightly longer at 13.5 hours, or 0.81 minutes per gene:
[Page 45]

~RankNorm(otherct_expr) + Run_Tissue_name + spatialClusteringAssignments,

In general, models with random effects will take longer for fitting than models with fixed effects. Negative binomial model fitting is slower than Poisson model fitting, which in turn is slower than Gaussian model fitting for almost all package implementations of these regression routines.

An exception to this general guidance is that negative binomial (and Poisson) mixed effects models are fit in AtoMx SIP using the `nebula` R package (He et al. 2021), which is exceptionally fast, and is the reason for the faster benchmarking described above.

[Page 46]

CosMx SMI Pipeline Modules

CosMx SMI data analysis pipelines are composed of different modules to customize analysis according to the experimental design. Modules do not necessarily need to be run in the order listed in the following pages. Prerequisites are defined in each module's description. See How to Run Data Analysis Pipelines That Require Parameter Inputs on page 40 for guidance to tailor pipelines to your needs. See Appendix I: Literature References on page 86 for applicable literature references.

Please note that analysis of custom protein targets relies on the selection of the correct custom SPK file at the time of run set-up on the CosMx SMI instrument. Please contact support.spatial@bruker.com for additional support.

Analysis in the CosMx SMI Data Analysis Suite should be customized to the experimental design of your studies. Neither the software nor this user manual intends to prescribe the "right" way to analyze your data.

Initial Data

Prerequisite modules: None

Module description: CosMx SMI data is loaded into the pipeline orchestrator for downstream analysis.

Output visualizations: Study statistics table (see CosMx SMI Data Visualizations on page 70). See Table 4 for study statistics calculations.
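As a rough illustration of the study statistics defined in Table 4, the Python sketch below computes the FOV-level and flow-cell-level values from per-cell counts. It is not the AtoMx implementation: the function names and input layout are hypothetical, and the linear-interpolation percentile rule is an assumption. The mean unique genes per cell would follow the same pattern as the mean transcripts per cell.

```python
from statistics import mean


def percentile(values, q):
    """Percentile (q in 0-100) by linear interpolation; the rule is an assumption."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100
    low = int(pos)
    high = min(low + 1, len(ordered) - 1)
    return ordered[low] + (ordered[high] - ordered[low]) * (pos - low)


def fov_stats(transcripts_per_cell, neg_counts_per_cell):
    """FOV-level statistics for one FOV, from per-cell values."""
    return {
        "mean_transcripts_per_cell": mean(transcripts_per_cell),
        "p10_transcripts_per_cell": percentile(transcripts_per_cell, 10),
        "p90_transcripts_per_cell": percentile(transcripts_per_cell, 90),
        # Sum of negative probe counts in the FOV / number of cells in the FOV.
        "mean_neg_probe_counts_per_cell":
            sum(neg_counts_per_cell) / len(neg_counts_per_cell),
    }


def flow_cell_stats(fovs):
    """Flow-cell-level statistics; `fovs` is a list of
    (transcripts_per_cell, neg_counts_per_cell) pairs, one per FOV."""
    all_transcripts = [t for transcripts, _ in fovs for t in transcripts]
    per_fov_neg_means = [sum(neg) / len(neg) for _, neg in fovs]
    return {
        # Transcript statistics pool every cell in the flow cell.
        "mean_transcripts_per_cell": mean(all_transcripts),
        "p10_transcripts_per_cell": percentile(all_transcripts, 10),
        "p90_transcripts_per_cell": percentile(all_transcripts, 90),
        # Negative probe counts are a mean of per-FOV means, so each FOV
        # carries equal weight regardless of how many cells it contains.
        "mean_neg_probe_counts_per_cell": mean(per_fov_neg_means),
    }
```

Because the flow-cell negative probe value averages the per-FOV means, it can differ from a pooled per-cell average when FOVs contain very different numbers of cells.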
Value | Calculation (FOV level) | Calculation (Flow cell level)
Mean transcript per cell | Average of all transcript-per-cell values for the FOV | Average of all transcript-per-cell values for the flow cell
Mean unique genes per cell | Average of all unique-genes-per-cell values for the FOV | Average of all unique-genes-per-cell values for the flow cell
10th percentile transcript per cell | 10th percentile of all transcript-per-cell values for the FOV | 10th percentile of all transcript-per-cell values for the flow cell
90th percentile transcript per cell | 90th percentile of all transcript-per-cell values for the FOV | 90th percentile of all transcript-per-cell values for the flow cell
Mean neg probe counts per cell | Sum of negative probe counts for the FOV / Total number of cells in the FOV | Per-FOV mean neg probe counts averaged across the flow cell

Table 4: Study statistics calculations.

[Page 47]

Quality Control - RNA

Prerequisite modules: Initial Data

Module description: This module flags unreliable negative probes, cells, FOVs, and target genes, as defined below. Flagged items are not removed from downstream analysis.
• Negative probe QC flags negative control probes that appear to behave like outliers in the tissue. This helps prevent tissue-specific or sample-specific background effects from impacting other QC metrics that rely on the negative control probe values. Grubbs' test is used, and negative probes that are designated outliers (according to the p-value parameter set by the user) are flagged.
• Cell QC flags cells with low or spurious signal based on the number of targets detected (more is better), fraction of probes which are negative controls (fewer is better), uniformity of count distribution (higher complexity is preferred), and cell size (the top percentile may need to be removed as a QC of segmentation).
• FOV QC identifies FOVs that have generally low expression. Two approaches are available: the Mean method flags FOVs where the total count per cell averages below a threshold. The Quantile method flags FOVs where a highly expressed gene is below background. See details, below.
• Target level QC flags targets that appear to be below background across the dataset, based on probe distribution relative to negative control probes.

Custom pipeline module name (optional): Assign a unique name to your pipeline module, which will be used for identifying pipeline module results in downstream analysis. Ensure the name is unique within your study and does not exceed 100 characters. Only letters, numbers, underscores (_), hyphens (-), colons (:), and parentheses are permitted. If left blank, a default name will be assigned.

Input parameters:
• Negative probe QC:
◦ Outlier P-value cutoff: (default: 0.01; range: 0-1) to flag outlier negative probes.
◦ Remove QC flagged negative control probes (yes/no). (It is generally recommended to be aware of, but not remove, flagged items from the dataset.)
• Cell QC:
◦ Minimal counts per cell: recommend 50 or 100 for 6K panel; 20 for 1000-plex panel; 5 for 100-plex panel; must be > 1. Increase the threshold to make QC more stringent.
◦ Proportion of negative counts: flag cells where > 10% (0.1, the default value) of the counts per cell are negative probes. Decrease the threshold to make QC more stringent.
◦ Count distribution: this metric provides a measure of transcription complexity. It calculates the (total counts) / (number of detected genes). If set to 1 (default), the module will flag cells in which the number of total counts equals the number of detected genes (i.e. each detected gene has 1 count). If set to 100, the module will flag cells in which the number of total counts is less than 100x the number of detected genes. The allowable range is 1-200 but it is recommended to leave the value set at 1.
◦ Area outlier: Grubbs' test p-value (default: 0.01, range 0-1) to flag outlier cells based on cell area. If megakaryocytes or other cell types known to have large cell area are in your sample, keep the p-value low to make the outlier designation stringent.

[Page 48]
• FOV QC:
◦ FOV QC method: mean (default) or quantile. The mean method is based on transcript levels as defined in FOV count cutoff, below. The quantile method is based on the relationship between a high signal and background, as defined in FOV QC quantile, below. The method of choice is up to the user.
◦ FOV count cutoff: only applies to Mean method. Flags FOVs that have an average transcripts-per-cell below this value (default: 100; range ≥ 0).
◦ FOV QC quantile and FOV quantile to negative cutoff: only apply to Quantile method. Flags FOVs in which a high signal (e.g., 90th percentile gene count) is not sufficiently above background (the median of the negative probes). "Sufficiently above" is defined by the FOV quantile to negative cutoff. Quantile default: 0.9 (90th percentile), range 0-1. Cutoff default: 0, must be ≥ 0.
• Target level QC:
◦ Negative control probe quantile cutoff: set the threshold at which to flag probes. A value of 0.5 (default) will flag probes with lower total counts than the median (50th percentile) of the negative control probes' counts. Range 0-1.
◦ Detection: set the p-value at which a target is deemed to be above background (0.01 default, range 0-1). It is recommended to set the value to 0, resulting in no flags applied to the targets. (If you wish to adjust this parameter, be aware that a smaller p-value requires that the target is higher above background. A larger p-value allows the target to be closer to background.)

Output visualizations: QC metrics table (see Table 5), XY plot, heatmap, box plot, violin plot, and histogram. See CosMx SMI Data Visualizations on page 70.

Explore your QC dataset:
• Look at QC data overlaid on the tissue image by including the Image Viewer panel in your Data Analysis Suite view, selecting the button Cells to overlay cell data, then selecting Expression and Step: Quality Control from the dropdown menus. From the next dropdown menu, select a gene with known or expected expression in the tissue. Does it pass a 'sanity check' in terms of its expression in this region of the sample? Do visualization markers' expression match the tissue morphology?
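The Cell QC and FOV QC input parameters described for this module can be read as simple threshold rules. The Python sketch below is one interpretation under stated assumptions, not the AtoMx implementation: all function and flag names are hypothetical; cells exactly at a threshold are treated as described in the comments; and the Quantile method is assumed to compare the difference between the chosen quantile of per-target counts and the median negative probe count against the cutoff (the manual does not state whether a difference or a ratio is used). The Grubbs' test area outlier check is omitted.

```python
from statistics import mean, median


def cell_qc_flags(total_counts, neg_counts, detected_genes, min_counts,
                  max_neg_proportion=0.1, count_distribution=1):
    """Return the Cell QC flags raised for one cell (hypothetical flag names)."""
    flags = []
    # Assumption: a cell is flagged when its counts fall below the minimum.
    if total_counts < min_counts:
        flags.append("minimum_counts")
    # Flag when more than the given proportion of counts are negative probes.
    if total_counts > 0 and neg_counts / total_counts > max_neg_proportion:
        flags.append("proportion_negatives")
    # A cell passes only if (total counts) / (detected genes) exceeds the setting.
    if detected_genes > 0 and total_counts / detected_genes <= count_distribution:
        flags.append("count_distribution")
    return flags


def fov_flag_mean(transcripts_per_cell, count_cutoff=100):
    """Mean method: flag an FOV whose average transcripts per cell is below the cutoff."""
    return mean(transcripts_per_cell) < count_cutoff


def quantile_value(values, q):
    """Quantile (q in 0-1) by linear interpolation; the rule is an assumption."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q
    low = int(pos)
    high = min(low + 1, len(ordered) - 1)
    return ordered[low] + (ordered[high] - ordered[low]) * (pos - low)


def fov_flag_quantile(target_counts, neg_probe_counts, quantile=0.9, cutoff=0):
    """Quantile method (assumed form): flag when the high signal does not
    exceed the median negative probe count by more than the cutoff."""
    signal = quantile_value(target_counts, quantile)
    background = median(neg_probe_counts)
    return (signal - background) <= cutoff
```

For instance, `cell_qc_flags(12, 3, 12, min_counts=50)` raises all three flags: the cell has too few counts, 25% of them are negative probes, and every detected gene has exactly one count.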
[Page 49]
Summary | Metric | % Pass definition
Negative Probe QC Summary | Outlier Test | % of negative probes that pass Grubbs' test (i.e., are not outliers) based on the p-value threshold set in module parameters.
Cell QC Summary | Minimum Counts | % of cells above the minimum counts per cell threshold set in module parameters.
Cell QC Summary | Proportion of Negatives | % of cells with negative probe count proportion less than the threshold set in module parameters.
Cell QC Summary | Complexity (count distribution) | % of cells with ratio of total counts to the number of detected genes greater than the value set in module parameters.
Cell QC Summary | Area | % of cells with cell area that is not designated as an outlier according to the threshold set in module parameters.
FOV QC Summary | FOVs Flagged | % of FOVs that were not flagged in FOV QC, according to module parameters set by the user.
FOV QC Summary | Cells Flagged across FOVs | % of cells belonging to FOVs that passed FOV QC.
Target QC Summary | Negative Control | % of targets whose total counts are greater than the defined quantile of the negative control probes' total counts, set in module parameters.
Target QC Summary | Detection over Background | % of targets above background, according to the p-value threshold set in module parameters.
Table 5: RNA QC metrics and pass definitions

Read more about factors contributing to data quality in the "Top 3 Tips for Successful CosMx SMI Single Cell Spatial Runs at 1000-Plex".

[Page 50]
Quality Control - Protein

Prerequisite modules: Initial Data

Module description: This module flags unreliable cells based on segmented cell area, negative probe expression, and high/low target expression. Cells with outlier Grubbs' test p-values < 0.01 for segmented area are flagged. Cells with mean negative probe values below the lower threshold or above the upper threshold (as defined in input parameters; see below) are flagged. Cells with overly high or low target expression (as defined in input parameters; see below) are flagged. Flagged cells are not removed from the dataset.
Custom pipeline module name (optional): Used to identify module results in downstream analysis; see full description on page 47.

Input parameters:
• Negative probe range: flag cells with negative probe mean below the lower threshold (default 2) or above the upper threshold (default 50).
• High expression proteins' proportion and threshold: flag cells where 50% or more (0.5, default; range 0-1) of proteins are in the 90th percentile or higher (0.9, default; range 0-1). To make QC more stringent, decrease the percentage and/or decrease the threshold.
• Low expression proteins' number and threshold: flag cells where fewer than 3 (default; range 0-N, where N is the total number of proteins in the panel) proteins are in the 50th percentile or higher (0.5, default; range 0-1). To make QC more stringent, increase the number and/or increase the threshold.

Output visualizations: QC metrics table (see Table 6), XY plot, heatmap, box plot, violin plot, and histogram. See CosMx SMI Data Visualizations on page 70.

Explore your QC dataset:
• Look at QC data overlaid on the tissue image (include the Image Viewer panel in your Data Analysis Suite view, select Cells to overlay cell data, then Expression and Step: Quality Control from the dropdown menus. From the next dropdown menu, select a target with known or expected expression in the tissue). Does it pass a 'sanity check' in terms of its expression in this region of the sample? Does the expression of visualization markers match the tissue morphology?

Summary | Metric | % Pass definition
Cell QC Summary | Negative probes | % of cells with negative probe mean within the range set in module parameters
Cell QC Summary | Low expression | % of cells with enough protein expression to meet the criteria set in module parameters
Cell QC Summary | High expression | % of cells with few enough high-expressing proteins to meet the criteria set in module parameters
Cell QC Summary | Area | % of cells with cell area that is not designated as an outlier in Grubbs' outlier test (p < 0.01)
Table 6: Protein QC metrics and pass definitions
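The three protein Cell QC rules can be sketched as follows. The manual does not say how the percentiles are computed, so this sketch assumes that each protein's percentile cutoffs are taken across all cells in the dataset; that assumption, the linear-interpolation `quantile` helper, and the function name `protein_qc_flags` are all illustrative rather than a description of the AtoMx code.

```python
def quantile(values, q):
    """Linear-interpolation quantile of a non-empty list, with 0 <= q <= 1."""
    s = sorted(values)
    pos = q * (len(s) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(s) - 1)
    return s[lo] + (s[hi] - s[lo]) * (pos - lo)


def protein_qc_flags(expr, neg_means, neg_range=(2, 50),
                     high_prop=0.5, high_q=0.9, low_n=3, low_q=0.5):
    """Return one {metric: True if flagged} dict per cell.

    expr: list of cells, each a list of per-protein intensities
    neg_means: per-cell mean negative-probe value
    Defaults mirror the values quoted in the input parameters.
    """
    n_prot = len(expr[0])
    # Assumption: percentile cutoffs are computed per protein across all cells.
    columns = [[cell[j] for cell in expr] for j in range(n_prot)]
    high_cut = [quantile(col, high_q) for col in columns]
    low_cut = [quantile(col, low_q) for col in columns]

    flags = []
    for cell, neg in zip(expr, neg_means):
        n_high = sum(v >= t for v, t in zip(cell, high_cut))
        n_above_low = sum(v >= t for v, t in zip(cell, low_cut))
        flags.append({
            "negprobe_out_of_range": neg < neg_range[0] or neg > neg_range[1],
            # Too many proteins at or above the high percentile.
            "high_expression": n_high / n_prot >= high_prop,
            # Too few proteins at or above the low percentile.
            "low_expression": n_above_low < low_n,
        })
    return flags
```

Lowering `high_prop` or `high_q` flags more cells as high expressers, and raising `low_n` or `low_q` flags more cells as low expressers, which matches the "more stringent" directions given for each parameter.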
[Page 51]
Gene Selection - RNA

Prerequisite modules: Quality Control

Module description: Gene Selection defines a set of genes for downstream analysis of RNA datasets. It does not remove genes from the dataset, but enables a focused analysis in certain downstream modules. Genes can be selected by the user (by uploading a list) or automatically based on the highly variable genes calculated by Seurat's FindVariableFeatures method. Additional guidance can be provided to the module by naming genes to include or exclude in the selected set. The resulting gene set can be used as input for the Normalization, Cell Typing (InSituType), PCA, Identify Marker Genes, and Spatial Expression Analysis modules.

Custom pipeline module name (optional): Used to identify module results in downstream analysis; see full description on page 47.

Input parameters:
• Gene List Name (create a name for this subset of genes; only letters, numbers, underscores and periods are allowed, and the name cannot start with a number).
• Select Include Highly Variable Genes to subset based on the Seurat-calculated highly variable gene list. Optionally adjust the Smoothing Sensitivity (default 0.3), which tunes Seurat calling of highly variable genes. Increasing the value results in more smoothing; it will favor genes with consistently high variance across broader expression ranges. Decreasing the value results in less smoothing; the model may catch more highly variable genes within specific, narrow expression windows, but may include genes that are variable due to noise rather than biological signal. Select the number of highly variable genes to include and (optionally) select genes to require or exclude from the dropdown menus.
• Select Include User Selected Genes to subset based on a user-provided gene list. Download the full gene list by clicking the download icon, edit as needed, and upload back to the module by clicking Upload Custom Gene List.
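The way the input parameters above combine (a ranked list of variable genes, a requested number, and optional required or excluded genes) can be sketched as below. This is a simplified stand-in: genes are ranked by plain variance of their counts, which is not Seurat's FindVariableFeatures method and has no equivalent of the Smoothing Sensitivity setting. The function names, and the assumption that required genes are added on top of the requested number, are hypothetical.

```python
import re
from statistics import pvariance


def valid_gene_list_name(name):
    """Letters, numbers, underscores and periods only; must not start with a number."""
    return re.fullmatch(r"[A-Za-z_.][A-Za-z0-9_.]*", name) is not None


def select_genes(counts_by_gene, n_variable, include=(), exclude=()):
    """Build a gene list from the most variable genes plus required genes.

    counts_by_gene: dict mapping gene name -> list of per-cell counts
    n_variable: number of highly variable genes to keep
    include / exclude: genes to require in, or exclude from, the result
    """
    excluded = set(exclude)
    # Stand-in ranking: plain variance, highest first (not Seurat's method).
    ranked = sorted(counts_by_gene,
                    key=lambda g: pvariance(counts_by_gene[g]),
                    reverse=True)
    selected = [g for g in ranked if g not in excluded][:n_variable]
    # Assumption: required genes are appended if they were not already selected.
    for g in include:
        if g in counts_by_gene and g not in excluded and g not in selected:
            selected.append(g)
    return selected
```

A call such as `select_genes(counts, 2000, include=["CD3E"], exclude=["MALAT1"])` would return a list that a downstream step could use to subset the expression matrix; the gene names in that call are examples only.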
Output visualizations: No visualizations available, but the resulting gene list can be used as input for the Normalization, Cell Typing (InSituType), PCA, Identify Marker Genes, and Spatial Expression Analysis modules (select the appropriate Gene List name when setting the input parameters for the downstream module; see additional notes in those modules' descriptions, below). The resulting gene list can also be downloaded after the module runs.

Normalization - RNA

Prerequisite modules: Quality Control

Module description: Generates normalized expression data from counts. RNA normalization adjusts for library size factors to ensure that cell-specific total transcript abundance and distribution of counts (which may vary between some FOVs and between samples) do not influence downstream visualization and data analysis. Three normalization methods are available:
• Total count normalization (default): Gene count in a cell / total counts in the cell.
• Seurat uses Seurat::NormalizeData() with the "LogNormalize" default setting. (Feature counts for each cell are divided by the total counts for that cell and multiplied by the scale factor. This is then natural-log transformed using log1p.) See reference in Appendix I: Literature References on page 86.
[Page 52]
• Pearson residuals normalization is based on the estimated mean and variance: (gene count in a cell - mean gene count in the cell) / standard deviation of gene counts in the cell. An overdispersion factor can be specified in the module. Overdispersion is variance greater than what is predicted by the model. Learn more from tutorials such as "Regression with Count Data: Poisson and Negative Binomial". See reference in Appendix I: Literature References on page 86.

Custom pipeline module name (optional): Used to identify module results in downstream analysis; see full description on page 47.

Input parameters: Normalization method (select Seurat, Pearson residuals, or Total counts).
For the Pearson residuals method, set the overdispersion value (see above): default 100; must be ≥ 0. Gene List Name (if the Gene Selection module has been run, the resulting gene list is available to select for input to the Normalization module. The Normalization module will produce a subset of the expression matrix based on the subsetted genes).

Output visualizations: XY plot (Figure 41), heatmap, box plot, violin plot, histogram. See CosMx SMI Data Visualizations on page 70.

Explore your normalized dataset:
• Evaluate the data for normalization bias: overlay normalized data on the tissue image by including the Image Viewer panel in your Data Analysis Suite view, selecting the button Cells to overlay cell data, then Expression and Step: Normalization from the dropdown menus. From the next dropdown menu, select a housekeeping gene or other target expected to have even expression throughout the tissue. Is expression bias observed across FOVs?

Normalized data is used as the input to generate heatmaps, violin plots, box plots, PCA, UMAP, and to visualize counts on tissue. It is not used as the input to differential expression or other modules that include a normalization function.

Figure 41: Normalized data in XY plot, colored by expression, helps evaluate normalization bias.

[Page 53]
Normalization - Protein

Prerequisite modules: Quality Control (Protein)

Module description: Generates normalized (background-subtracted) protein data from mean fluorescence intensity (MFI) values. Protein normalization is based on the concepts of:
• Total intensity scaling, to reduce the effect of technical artifacts such as shading or edge effects.
• arcsinh transformation, to improve visualization clarity and stabilize variance across the sample.
Total intensity scaling: Since protein data involves continuous intensities rather than counts, the total intensity for a cell refers to the sum of the average intensities of each protein in the cell. This accounts for technical artifacts where certain parts of the image are brighter or dimmer. Total intensity scaling essentially converts an absolute intensity to a proportion. This proportion is then scaled back up by the average (across cells) total intensity. This is similar to RNA normalization, in which counts for a given gene in a cell are divided by total counts across all genes in the cell.

The arcsinh transformation is used to stabilize the variance, so that observations with higher intensity don't also have higher variance in that intensity. Arcsinh is a standard data transformation in modern flow cytometry that combines linear scaling for values close to zero with logarithmic scaling for larger (negative and positive) values, with the transition between scales smoothed out. It brings protein data to a more "normal" distribution. Read more in Finak, Perez, Weng et al. (2010) and Folcarelli and van Staveren et al. (2021).

Custom pipeline module name (optional): Used to identify module results in downstream analysis; see full description on page 47.

Input parameters: Total intensity normalization (yes/no); Transformation (yes/no). Default: 'yes' for all parameters.

Output visualizations: XY plot (Figure 41), heatmap, box plot, violin plot, histogram. See CosMx SMI Data Visualizations on page 70.

Explore your normalized dataset:
• Evaluate the data for normalization bias: overlay normalized data on the tissue image (include the Image Viewer panel in your Data Analysis Suite view, select Cells to overlay cell data, then Expression and Step: Normalization from the dropdown menus). From the next dropdown menu, select a housekeeping target or other target expected to have even expression throughout the tissue. Is expression bias observed across FOVs?
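The arithmetic behind the RNA and Protein normalization modules is compact enough to sketch directly. The functions below follow the formulas as worded in the module descriptions; they are an illustration, not the AtoMx code. The scale factor of 10,000 is Seurat's documented default for LogNormalize and is assumed here, and `cofactor` is a hypothetical arcsinh scaling constant that the manual does not mention. Pearson residuals are left out because the manual does not specify the underlying model in enough detail to reproduce it.

```python
import math


def total_count_normalize(counts):
    """RNA, total count method: gene count in a cell / total counts in the cell."""
    total = sum(counts)
    return [c / total if total else 0.0 for c in counts]


def log_normalize(counts, scale_factor=10_000):
    """RNA, LogNormalize-style: divide by the cell total, multiply by a
    scale factor, then apply the natural-log transform log1p."""
    total = sum(counts)
    return [math.log1p(c / total * scale_factor) if total else 0.0
            for c in counts]


def protein_normalize(cells, scale=True, transform=True, cofactor=1.0):
    """Protein: optional total intensity scaling, then optional arcsinh.

    cells: list of cells, each a list of per-protein mean intensities
    cofactor: hypothetical divisor applied before arcsinh (assumption)
    """
    out = [list(cell) for cell in cells]
    if scale:
        totals = [sum(cell) for cell in out]
        mean_total = sum(totals) / len(totals)
        # Convert each intensity to a proportion of its cell's total, then
        # scale back up by the average total intensity across cells.
        out = [[v / t * mean_total if t else 0.0 for v in cell]
               for cell, t in zip(out, totals)]
    if transform:
        out = [[math.asinh(v / cofactor) for v in cell] for cell in out]
    return out
```

With scaling alone, two cells whose intensities differ only by an overall brightness factor, such as `[1, 3]` and `[2, 6]`, end up with identical values, which is the shading correction the module description refers to.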
Normalized data is used as the input to generate heatmaps, violin plots, box plots, PCA, UMAP, and to visualize counts on tissue. It is not used as the input to differential expression or other modules that include a normalization function.

[Page 54]
Principal Component Analysis (PCA) - RNA or Protein

Prerequisite modules: Normalization

Module description: PCA provides an orthogonally constrained dimensional reduction analysis of the count data across all cells in the dataset. It produces output values (principal components, or PCs) representing axes of variation within the data, which are a combined value of weighted expression in a given cell. PCs are ordered by decreasing variation explained in the data. These can be used to better understand variation within a dataset, but are most commonly used in single-cell analysis as an input for the UMAP analysis.

Custom pipeline module name (optional): Used to identify module results in downstream analysis; see full description on page 47.

Input parameters:
• Number of principal components calculated (default 50; must be an integer ≥ 3).
• Gene List Name: if the Gene Selection module has been run, the resulting gene list is available to select for input to the PCA module. Please note that the Normalization module automatically passes its results to the PCA module, so if Normalization was run on a subset of genes, and PCA follows, it will also run on that subset of genes regardless of the selection made in 'Gene List Name' in the PCA module. If Normalization was performed on all genes, and the Gene Selection module followed, then the PCA module can use either the subsetted genes from the Gene Selection module or all genes, as set by 'Gene List Name' in the PCA module.

Output visualizations: PCA plot. See CosMx SMI Data Visualizations on page 70.

Explore your PCA dataset: While clustering may be scrutinized here, generally, the PCA dataset feeds directly into UMAP and clustering is evaluated there.
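To make "orthogonal axes ordered by decreasing variation" concrete, here is a small PCA sketch that uses only the Python standard library. It finds components by power iteration on the covariance matrix with deflation; this technique is chosen only because it needs no third-party package and is an assumption, not a description of the solver AtoMx uses. The function name `pca` and its arguments are illustrative.

```python
import math
import random


def _matvec(matrix, vec):
    return [sum(a * b for a, b in zip(row, vec)) for row in matrix]


def _unit(vec):
    norm = math.sqrt(sum(t * t for t in vec))
    return [t / norm for t in vec] if norm else vec


def pca(rows, n_components=2, n_iter=200, seed=0):
    """Return (components, variances), ordered by decreasing variance.

    rows: list of cells, each a list of normalized expression values
    """
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    # Sample covariance matrix (d x d).
    cov = [[sum(c[a] * c[b] for c in centered) / (n - 1) for b in range(d)]
           for a in range(d)]

    rng = random.Random(seed)
    components, variances = [], []
    for _ in range(n_components):
        v = _unit([rng.gauss(0, 1) for _ in range(d)])
        for _ in range(n_iter):
            # Repeated multiplication converges to the top eigenvector.
            v = _unit(_matvec(cov, v))
        lam = sum(a * b for a, b in zip(v, _matvec(cov, v)))
        components.append(v)
        variances.append(lam)
        # Deflation: remove the variance explained by this component.
        cov = [[cov[a][b] - lam * v[a] * v[b] for b in range(d)]
               for a in range(d)]
    return components, variances
```

Projecting a centered cell onto `components[0]` gives its PC1 score, the "combined value of weighted expression" mentioned in the module description. A fixed iteration count is used for brevity; a production implementation would test for convergence instead.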
UMAP - RNA or Protein

Prerequisite modules: PCA

Module description: UMAP (Uniform Manifold Approximation and Projection for dimension reduction) provides a visualization of high-plex complex datasets in 2-dimensional space using a non-linear approach to estimate related groups of cells or features. This method is a common way of visualizing single-cell data to identify clusters of related cells which may be from the same lineage.

Custom pipeline module name (optional): Used to identify module results in downstream analysis; see full description on page 47.

Input parameters:
• Number of neighbors: the number of neighboring points used in local approximations of manifold structure (default 30, range 5-50). Increase the value to preserve global structure at the loss of detailed local structure.
• Minimum distance: controls how tightly the embedding compresses points together (default 0.01, range 0.001-0.5). Increase the value to more evenly distribute embedded points; decrease the value to allow the algorithm to optimize more accurately with regard to local structure.
• Spread: the effective scale of embedded points. In combination with minimum distance (above), this parameter determines how clustered the embedded points are (default 5, range 0.5-10). Increase the value to increase spread and reduce clustering.
[Page 55]
• Distance metric: select Cosine (default), Euclidean, Manhattan, or Hamming, to determine the metric used to measure distance in the input space. Read more about distance metrics in machine learning in Ehsani and Drabløs (2020).
• Data fraction: set a % of PCA data to use as input. A value of 1 uses 100% of the data and results in a standard UMAP. A value of 0.25 uses 25% of the data to enable a UMAP projection with less computational burden (i.e., faster). Default 0.25, range 0.01-1.

Output visualizations: UMAP plot (displaying data from all FOVs and flow cells in the study). See CosMx SMI Data Visualizations on page 70.
Explore the UMAP dataset:
• Evaluate the UMAP plot: include the Pipeline Structure panel and Pipeline Data panel in your Data Analysis Suite view, then select the UMAP module in the Pipeline Structure panel. Select different color schemes by clicking the arrow (caret) to expand customization options. Try coloring the UMAP plot by tissue annotation, expression of individual targets, cell type, or total cell transcript counts, to evaluate clusters.
• Compare the UMAP plot to the XY plot to see where clusters of cells defined in the UMAP exist in the tissue: include the Data Viewer panel in the Data Analysis Suite view, and select step: UMAP for one visualization and step: Normalization for the other (Figure 42). Display the Normalization plot as a scatter plot and color code by cell type. Synchronize the color coding scheme to allow comparison of certain cell types in both visualizations.
• Examine patterns of co-expression between cell markers or targets with known behavior and the targets of interest to your experimental design.

Figure 42: Use the Data Viewer panel to evaluate UMAP data (top) compared to expression data displayed in an XY plot (bottom).
[Page 56]
Cell Typing (InSituType) - RNA

Prerequisite modules: QC, Normalization, or PCA

Module description: This module uses the InSituType algorithm to identify and subset data based on cell types (see Danaher et al. 2022). Read more about this module in the InSituType FAQs hosted on Github and the Cell Typing: Advanced Strategies article in the CosMx Analysis Scratch Space. Three methods are available:
• Supervised clustering: Cell type assignments are made based on a reference matrix specifying the average expression profile of each cell type. Use one of the provided reference matrices or generate your own (see instructions in gray box on page 46). A quality reference matrix will:
  - Include all the cell types present in your tissue. Granular cell types are preferred (e.g., separate profiles for "dendritic cell", "M1 macrophage", "M2 macrophage", etc.), but broad cell types are accepted (e.g., a single "myeloid" profile).
  - Include most of the genes from your CosMx SMI panel.
  - Come from a robust dataset. A profile based on just 20 cells from a rare cell population will be inaccurate.
• Unsupervised clustering: Cell type clusters are determined by the software without a reference matrix input; then cell type labels can be assigned to clusters based on marker expression or other characteristics. The single argument in unsupervised clustering is the number of clusters to fit. Evaluate the UMAP to infer a reasonable number of clusters, or rely on the default value of 10 clusters, which works well in most settings. After evaluating your clustering results, you have the option to merge closely related clusters or break up overly large clusters (see module Cell Type QC - RNA on page 60). Note that the number of clusters should be > 1 and must be an integer, not a range.
• Semi-supervised clustering: A cell type reference matrix is provided to the software, but not all cells in the dataset must fit into a reference cell type category. The user may add their own cell type definitions. The algorithm initially fits cells using the reference matrix, and cells that do not fit are split into n clusters (see input parameters, below).
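The idea behind supervised clustering, matching each cell to the reference profile it most resembles, can be illustrated with a toy example. InSituType itself uses a likelihood-based model (see Danaher et al. 2022); cosine similarity is swapped in here purely to keep the sketch short, so this is not the InSituType algorithm. The function names and the data layout (plain dictionaries) are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length numeric vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def assign_cell_types(cells, reference, genes):
    """Assign each cell the reference cell type with the most similar profile.

    cells: dict cell_id -> dict gene -> count
    reference: dict cell_type -> dict gene -> average expression (linear scale)
    genes: genes shared by the panel and the reference matrix
    """
    calls = {}
    for cell_id, counts in cells.items():
        cell_vec = [counts.get(g, 0) for g in genes]
        scores = {
            cell_type: cosine(cell_vec, [profile.get(g, 0) for g in genes])
            for cell_type, profile in reference.items()
        }
        # Stand-in decision rule: highest similarity wins.
        calls[cell_id] = max(scores, key=scores.get)
    return calls
```

The sketch also shows why the reference-matrix guidance matters: a cell type missing from `reference` can never be called, and genes absent from `genes` contribute nothing to the comparison.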
Custom pipeline module name (optional): Used to identify module results in downstream analysis; see full description on page 47.

Input parameters:
• Column for Subset: If desired, select a subset of cells to include in the Cell Typing analysis, based on the columns of the SampleMetadata.csv file (obtained using the Get Sample Metadata custom module; see Custom Modules on page 41 and module-specific instructions with the custom script in Github). Only columns containing boolean data are eligible (true/false, 1/0, or pass/fail).
  For example, to include only cells that passed QC, build a pipeline with Cell Typing (InSituType) downstream of the Quality Control module. After the QC module executes, the InSituType module field "Column for Subset" will populate with the available QC parameters. Select "qcCellsPassed" from this dropdown menu to run the InSituType module only on cells that passed QC. Refer to Table 7 for Column for Subset definitions.
  It is also possible to subset based on multiple parameters: follow the custom module instructions (in Github) to get the SampleMetadata.csv file, add a new column such as "FOV1-10 and passed QC", and fill in boolean data for each row (true/false). Use the Update Sample Metadata custom module to apply the metadata to the study and enable selection of this parameter in the field "Column for Subset". Cells that are excluded from Cell Typing analysis will have the label "NA" in the result.
[Page 57]
Column Name | Definition
qcFlagsCellCounts | Captures cells with counts per cell greater than the minimum threshold set in QC module parameters
qcFlagsCellPropNeg | Captures cells with an acceptable proportion of negative probe counts per cell, as set in QC module parameters
qcFlagsCellComplex | Captures cells that exceed the minimum count distribution (total counts / number of detected genes) as set in QC module parameters
qcFlagsCellArea | Captures cells that are not outliers in area as set in QC module parameters
qcCellsFlagged | Captures cells that fail any of these metrics
qcCellsPassed | Captures cells that pass all of these metrics
Table 7: InSituType 'Column for Subset' definitions

• Gene List Name: If the Gene Selection module has been run, the resulting gene list is available to select for input to the Cell Typing (InSituType) module. If 'All genes' is selected, this module will run on all genes even if the upstream Normalization module was run on a Gene Selection subset.
• Select from Supervised, Unsupervised, or Semi-supervised Clustering and input the Basic Parameters:
  - If Supervised, upload a reference matrix in .csv or .RData format (with genes in rows, cell types in columns, and expression values filling the matrix; max file size 100 MB). If uploading an .RData file, the underlying variable must be called profile_matrix. Values should be untransformed linear scale, starting from 0. Scaling of columns does not matter. See gray box, below, for instructions to obtain a reference matrix. The dialog "Awaiting Input" indicates the reference matrix is not yet successfully uploaded.
  - If Unsupervised, select the number of clusters to generate (recommend 10-20).
  - If Semi-supervised, set the number of clusters to which the algorithm will assign cells that do not fit the reference matrix, and upload a reference matrix (see gray box, below). The dialog "Awaiting Input" indicates the reference matrix is not yet successfully uploaded.
• Check the box to include the segmentation marker signal in the calculation.
To download a defined reference matrix, refer to the CosMx SMI-based profiles or scRNA-seq-based profiles hosted on Github. Download the appropriate .RData file. (If using scRNA-seq-based profiles, correct platform effects by selecting Rescale = TRUE in the InSituType module; more details below.)

Alternatively, derive your own matrix from an appropriate single-cell RNA-seq (scRNA-seq) dataset. Use a defined reference matrix as a template file to produce a .csv or .RData file that will be recognized by the software. If uploading an .RData file, the underlying variable must be called profile_matrix. (It is not necessary to scale the values to match the template file, as long as all matrix data is from the same scRNA-seq dataset. If combining scRNA-seq datasets to create one matrix, it is important to scale between datasets.) More information is available in the CosMx Cell Profiles Scratch Space article.

[Page 58]
• Review the Advanced Parameters tab. By leaving these parameters set to FALSE, the software will use the reference profile as-is. By setting any of these parameters to TRUE, the software will make adjustments to the reference profile according to your selections (more detail and a decision flowchart are available in the InSituType FAQs hosted on Github).
  - Rescale (True/False): Anchor cells (high-confidence cell type calls) are identified, then used to estimate gene-by-gene platform effects. These effects are then used to rescale the reference profiles to the space of the CosMx data. This is a more cautious method of updating your reference profiles.
  - Refit (True/False): Anchor cells are identified for each cell type, then used to re-estimate the cell type's expression profile, wholly replacing the original profile. This is a more aggressive method for updating your reference profiles; it can more accurately calibrate the reference profiles, but it is also more likely to fail. Refitting tends to perform better when rescaling is also selected.
  - Refine Anchors (True/False): If InSituType fails to discover enough anchor cells to perform the calibration, consider setting this parameter to TRUE and reducing the Min Anchor Log Likelihood Ratio (default 0.03, range 0.001-0.1) and Min Anchor Cosine (default 0.1, range 0-0.5), to lower the threshold for anchor cell selection. The defaults were optimized for 1K-plex data, so this adjustment is often needed for higher-plex studies.

Output visualizations: Heatmap of marker genes, flightpath plot, XY plot, and UMAP (colored by the results of the Cell Typing module). To access the flightpath plot, click the image icon on the successfully executed Cell Typing module in the Pipeline Structure panel. See CosMx SMI Data Visualizations on page 70.

Explore your Cell Typing dataset: Once the cell type clusters are projected, evaluate each cluster's spatial distribution, expression profile, and immunofluorescence values. To do so, include the Pipeline Structure panel, Pipeline Data panel, and Image Viewer panel in your Data Analysis Suite view, then select the Cell Typing module in the Pipeline Structure panel. You may also overlay cell types on the tissue in the Image Viewer panel: select the flow cell and FOV(s) to display from the dropdown menus; select Cells to overlay cell data and Step: Cell Typing (InSituType). Scrutinize the cell typing results by asking:
• Should any cell types be merged, sub-clustered, or deleted? Look at the output UMAP plot in the Pipeline Data panel (color by: cell type). Cell types that occupy disparate clusters on the UMAP are candidates for splitting into sub-clusters. Look at the flightpath plot (download the image from the module in the Pipeline Structure panel) for clusters with lots of cells spread between centroids, indicating that the cell types are frequently confused with each other and suggesting it may be reasonable to merge them. Clusters with poor confidence values (< 90%) that are confused with diverse other clusters, or that have very low average counts, may reasonably be deleted.
• Are cell types exhibiting the expected immunofluorescence results (e.g., do CD45 counts align with CD45+ immunofluorescence)? Can you identify known cell types based on tissue morphology?
• Supervised cell typing: are cell types correctly named? Do they have the expected spatial distribution (based on the XY plot)?
• Unsupervised clustering: what cell types do these clusters correspond to? Use their spatial distribution (in the XY plot) to assign cell types to the clusters.

Based on your observations, use the module Cell Type QC - RNA on page 60 to rename, merge, delete, or subcluster cell types.

[Page 59]
Expression Model (CELESTA) - Protein

Prerequisite modules: Normalization

Module description: The Protein Cell Typing (CELESTA) algorithm performs cell typing by taking into account each cell's marker expression profile and, if necessary, spatial information. Cell typing calls are guided by a signature matrix that specifies the marker(s) known to have high or low expression for each cell type. A bimodal Gaussian mixture model is then fit to estimate the probability of each cell having high expression for each considered marker. When the probability is sufficiently high, a cell is considered an "anchor cell". When the probability is not sufficiently high to make a high-certainty cell type call, the algorithm also considers spatial information by taking into account the cell type calls of neighboring cells. These are considered "index cells". The probability thresholds can be tuned by changing the tuning parameter input file to increase (or decrease) the number of a given cell type by decreasing (or increasing) the high_expression_threshold_anchor or high_expression_threshold_index for anchor and index cells, respectively (see Appendix II: Create a Signature Matrix and Tuning Parameter File for Cell Typing (CELESTA) on page 87). This module runs the first step of the CELESTA algorithm, fitting the bimodal Gaussian mixture model.

Custom pipeline module name (optional): Used to identify module results in downstream analysis; see full description on page 47.

Input parameters:
• Separately model by flow cell (checkbox; should separate models be fitted to each flow cell?).
• Signature matrix: the signature matrix defines which protein(s) will be used as markers for which cell types. The software defaults to the signature matrix matching the data in the study (human or mouse). If needed, download the default mouse signature matrix from Github/Nanostring-Biostats/CelestaSignatureLibrary, or create a custom signature matrix for human or mouse data and upload it to the module (see instructions in Appendix II: Create a Signature Matrix and Tuning Parameter File for Cell Typing (CELESTA) on page 87). Max upload file size 100 MB.

Output visualizations: XY plot (cells colored by probability for each protein from 0-1) and histogram (distribution of probability for each protein).

Cell Typing (CELESTA) - Protein

Prerequisite modules: Expression Model (CELESTA)

Module description: The Protein Cell Typing (CELESTA) algorithm performs cell typing by taking into account each cell's marker expression profile and, if necessary, spatial information. Cell typing calls are guided by a signature matrix that specifies the marker(s) known to have high/low expression for each cell type. A bimodal Gaussian mixture model is then fit to estimate the probability of each cell having high expression for each considered marker. When the probability is sufficiently high, a cell is considered an "anchor cell". When the probability is not sufficiently high to make a high-certainty cell type call, the algorithm also considers spatial information by taking into account the cell type calls of neighboring cells. These are considered "index cells". The probability thresholds can be tuned by changing the tuning parameter input file to increase (or decrease) the number of a given cell type by decreasing (or increasing) the high_expression_threshold_anchor or high_expression_threshold_index for anchor and index cells, respectively (see Appendix II: Create a Signature Matrix and Tuning Parameter File for Cell Typing (CELESTA) on page 87). This module runs the second step of the CELESTA algorithm, assigning cells to cell types using a signature matrix and tuning parameters.
[Page 60]
Custom pipeline module name (optional): Used to identify module results in downstream analysis; see full description on page 47.

Input parameters:
• Signature matrix: the signature matrix defines which protein(s) will be used as markers for which cell types. The software defaults to the human signature matrix. For a mouse study, download the mouse signature matrix from Github/Nanostring-Biostats/CelestaSignatureLibrary and upload it to the module. Alternatively, create and upload a custom signature matrix for human or mouse (see instructions in Appendix II: Create a Signature Matrix and Tuning Parameter File for Cell Typing (CELESTA) on page 87). Max upload file size 100 MB.
• Tuning parameter: the thresholds set in the tuning parameter file influence how the CELESTA algorithm calls cell types based on the signature matrix. The file defines the high and low thresholds for anchor and index cells. Use the default tuning parameter file (already loaded in the software) or upload a custom tuning parameter .csv file (see instructions in Appendix II: Create a Signature Matrix and Tuning Parameter File for Cell Typing (CELESTA) on page 87). Max upload file size 100 MB.
• Maximum neighborhood radius (µm) (0-100; default 30).
• Maximum number of cells in neighborhood.
• Spatial weight (0-10; default 5): the β value in the Potts model, which determines how strongly spatial information is weighted in cell typing. 0 indicates spatial information is ignored.
• Fast approximation (yes/no): use a fast and close approximation that splits cells into smaller spatially clustered groups of ~10,000 cells per group at a time.

Output visualizations: Additional columns are added to the study metadata showing the results from each round of CELESTA cell typing (the final cell typing designations are shown in the column final_cell_type); XY plot and UMAP colored by CELESTA cell type label.

Explore your Cell Typing dataset: Please see the prompts under Explore your Cell Typing dataset on page 58.
Cell Type QC - RNA

Prerequisite modules: Cell Typing (InSituType)

Module description: Refines cluster results from the InSituType algorithm by renaming, merging, deleting, and/or subclustering. The results originally generated by the InSituType algorithm in the Cell Typing - RNA module will be updated with the output of the Cell Type QC module.

Custom pipeline module name (optional): Used to identify module results in downstream analysis; see full description on page 47.

Input parameters:
• Merge From/To: select a cluster to merge with another (From), and a cluster into which to merge (To). Can also be used to rename a cluster (e.g., "a" to "Tumor cell").
• Delete: select clusters to delete. Cells will be re-classified using the best fit from remaining clusters.
• Subcluster: select a cluster to be split into multiple new subclusters.
• n: Specify the number of clusters (n) for the subclustering step, if selected above.

[Page 61]
Output visualizations: Visualizations such as UMAP, which can be colored by the results of the Cell Typing - RNA module, will be updated to reflect the output of the Cell Type QC module.

When you run the Cell Type QC module, you are altering the Cell Typing (InSituType) results by merging, subclustering, or deleting clusters. The original Cell Typing (InSituType) output will be overwritten and will no longer be available for visualization and downstream analysis. In addition, the previously specified module parameters will not be available for rechecking. Please record the Cell Typing (InSituType) parameters used, if needed.

Neighbor Network: Expression Space - RNA or Protein

Prerequisite modules: PCA

Module description: Constructs the KNN (k-Nearest Neighbor) graph based on the Euclidean distance in PCA space, then constructs the SNN (Shared Nearest Neighbor) graph with edge weights between any two cells based on the shared overlap in their local neighborhoods (Jaccard index) and pruning of distant edges.

Custom pipeline module name (optional): Used to identify module results in downstream analysis; see full description on page 47.
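The two graph-building steps named in the module description (a KNN graph in PCA space, then an SNN graph weighted by neighborhood overlap and pruned at a cutoff) can be sketched in plain Python. This is an illustration under stated assumptions, not the AtoMx implementation: each cell's neighborhood is assumed to include the cell itself, the brute-force neighbor search is only suitable for tiny inputs, and the names `knn` and `snn_graph` are hypothetical.

```python
import math


def knn(points, k):
    """Return, for each point, the set of indices of its k nearest
    neighbors by Euclidean distance (brute force)."""
    neighbors = []
    for i, p in enumerate(points):
        others = sorted((j for j in range(len(points)) if j != i),
                        key=lambda j: math.dist(p, points[j]))
        neighbors.append(set(others[:k]))
    return neighbors


def snn_graph(points, k, jaccard_cutoff=0.06):
    """Return {(i, j): weight} for cell pairs whose neighborhoods overlap.

    The weight is the Jaccard index of the two neighborhoods; edges with
    a weight less than or equal to the cutoff are pruned.
    """
    # Assumption: a cell belongs to its own neighborhood.
    hoods = [nbrs | {i} for i, nbrs in enumerate(knn(points, k))]
    edges = {}
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            shared = len(hoods[i] & hoods[j])
            weight = shared / len(hoods[i] | hoods[j])
            if weight > jaccard_cutoff:
                edges[(i, j)] = weight
    return edges
```

With two well-separated groups of three points and `k=2`, every within-group pair gets weight 1.0 and every cross-group pair is pruned, which is the structure a community-detection step such as Leiden clustering then works on.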
Input parameters:
• Jaccard cutoff: sets the stringency of pruning the dataset, from a value of 0 (no pruning) to 1 (total pruning). This value is the threshold or cutoff for acceptable Jaccard index when computing the neighborhood overlap for the SNN construction. Any edges with values less than or equal to this value will be set to 0 and removed from the SNN graph. Default: 0.06, valid range: 0-1.
• Distance metric: Euclidean (default), Cosine, Manhattan, or Hamming.
Read more about distance metrics in machine learning in Ehsani and Drabløs (2020).
Output visualizations: None (the output dataset is used to run Leiden Clustering but is not visualized itself).

Leiden Clustering - RNA or Protein
Prerequisite modules: Neighbor network: expression space
Module description: Leiden clustering is an unsupervised clustering method that is used to identify groups of cells which are related based on how similar they are in a graph structure. Clusters are defined by moving cells to identify groups of cells that can be aggregated without changing the overall relationship of the graph and looking for unstable nodes which serve as bridges between related communities to help define the boundaries of different clusters. The resolution that you select will determine the overall number of clusters identified after running the algorithm, with lower numbers identifying fewer clusters, and higher numbers identifying more.
Custom pipeline module name (optional): Used to identify module results in downstream analysis; see full description on page 47.
[Page 62]
Input parameters: Resolution (determines the overall number of clusters identified after running the algorithm, where a low number yields fewer, larger groups and a high number yields more, smaller groups; default 1, range 0.2-3).
Output visualizations: Leiden cluster annotation is generated and included in study metadata. If UMAP has been run, the default visualization of the UMAP plot will be to color by Leiden clusters (Figure 43). Leiden clusters can also be used to color an XY plot. See CosMx SMI Data Visualizations on page 70.
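The KNN-to-SNN construction described for the Neighbor Network module can be illustrated with a minimal sketch. This is not the AtoMx SIP implementation: the function names are hypothetical, only the Euclidean metric is shown, and counting each cell as part of its own neighborhood is an assumption. It only shows how a Jaccard index on neighbor sets becomes an edge weight and how edges at or below the cutoff are removed.

```python
from math import dist


def knn_graph(points, k):
    """For each point (e.g. a cell in PCA space), return the index set of its k nearest neighbors."""
    neighbors = []
    for i, p in enumerate(points):
        others = sorted((j for j in range(len(points)) if j != i),
                        key=lambda j: dist(p, points[j]))
        neighbors.append(set(others[:k]))
    return neighbors


def snn_graph(neighbors, jaccard_cutoff=0.06):
    """Weight each cell pair by the Jaccard index of their neighborhoods; drop edges <= cutoff."""
    edges = {}
    n = len(neighbors)
    for i in range(n):
        for j in range(i + 1, n):
            # Assumption: a cell belongs to its own neighborhood.
            a = neighbors[i] | {i}
            b = neighbors[j] | {j}
            jaccard = len(a & b) / len(a | b)
            if jaccard > jaccard_cutoff:
                edges[(i, j)] = jaccard
    return edges


# Hypothetical 2D coordinates standing in for PCA space: two well-separated groups.
pca_points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
              (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
snn_edges = snn_graph(knn_graph(pca_points, k=2), jaccard_cutoff=0.06)
```

In this toy input, cells within a group share all neighbors and keep their edges, while pairs across groups share none and are pruned. A higher cutoff prunes more aggressively, which matches the 0 (no pruning) to 1 (total pruning) description of the parameter.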
Figure 43: UMAP with color coding by Leiden clustering
Explore your Cell Typing dataset:
• Evaluate each cluster's spatial distribution, expression profile, and immunofluorescence values. To do so, include the Pipeline Structure panel, Pipeline Data panel, and Image Viewer panel in your Data Analysis Suite view, then select the Leiden Clustering module in the Pipeline Structure panel. You may also overlay cell types on the tissue in the Image Viewer panel: select the flow cell and FOV(s) to display from the dropdown menus; select Cells to overlay cell data and Step: Leiden Clustering.
• Are cell types exhibiting the expected immunofluorescence results (e.g., do CD45 counts align with CD45+ immunofluorescence)? Can you identify known cell types based on tissue morphology?
• What cell types do these clusters correspond to? Use their spatial distribution (in the XY plot) to assign cell types to the clusters.
• Evaluate any previously generated UMAP with the new option to Color by: Leiden Clustering.

Identify Marker Genes - RNA or Protein
Prerequisite modules: Cell Typing, Leiden Clustering, or Neighborhood Analysis
Module description: This module identifies markers associated with each cell type or cluster previously identified in the dataset. It looks for genes that are consistently expressed above background and that are also most specifically restricted to each cell type or cluster within the dataset. The module acts on each gene independently.
Custom pipeline module name (optional): Used to identify module results in downstream analysis; see full description on page 47.
Input parameters: Gene List Name (if the Gene Selection module has been run, the resulting gene list is available to select for input to the Identify Marker Genes module). If 'All genes' is selected, this module will run on all genes even if the upstream Normalization module was run on a Gene Selection subset.
Output visualizations: Results matrix which consists of genes x cell types (one value for each cell type/gene pair; values are the average estimated value of the gene within all cells matching the ID); heatmap of marker genes vs. cell types scaled across cell types such that the heatmap value is the z-score of expression across all cell types for a given gene. See CosMx SMI Data Visualizations on page 70.
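The heatmap scaling described above (a z-score of each gene across cell types) can be sketched as follows. This is an illustration only: the function name, the gene names, and the values are made up, and the use of the population standard deviation is an assumption, since the manual does not state which variant the software uses.

```python
from statistics import mean, pstdev


def zscore_across_cell_types(results_matrix):
    """Scale each gene's row of per-cell-type averages to mean 0 and unit SD."""
    scaled = {}
    for gene, values in results_matrix.items():
        mu = mean(values)
        sd = pstdev(values)  # assumption: population SD
        # A gene with identical values in every cell type carries no contrast.
        scaled[gene] = [0.0 if sd == 0 else (v - mu) / sd for v in values]
    return scaled


# Hypothetical genes x cell types matrix (columns: cell types a, b, c).
example_matrix = {
    "GENE_A": [1.0, 2.0, 3.0],
    "GENE_B": [5.0, 5.0, 5.0],
}
example_scaled = zscore_across_cell_types(example_matrix)
```

Because every gene is scaled independently, the heatmap colors compare cell types within a gene and should not be read as comparing expression levels between genes.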
Explore your identified marker genes: Do well-characterized cell type markers appear to be expressed only in their canonical cell types?
[Page 63]
Neighborhood Analysis - RNA or Protein
Prerequisite modules: Cell Typing or Leiden Clustering
Module description: This module identifies distinct cellular neighborhood clusters (niches) based on cell type composition and XY coordinates. This module helps define the structural composition of a tissue by looking for regional differences in cell type composition. Niches can be repeated structures that are frequently found within a tissue but which are not contiguous (e.g., glomeruli in the kidney, germinal centers in the lymph node) or which are physically connected across a tissue (e.g., epithelial layer in the colon).
Custom pipeline module name (optional): Used to identify module results in downstream analysis; see full description on page 47.
Input parameters:
• Method: either Radius (µm) to capture nearest neighbors in space (default: 50 µm, range: 10-500 µm) or Neighboring Cells to indicate the number of nearest neighbors to evaluate (default: 250, range: 10-500).
• Number of neighborhoods (clusters) desired: default: 10, range: integers ≥ 3.
Output visualizations: There is not a specific visualization for this module, but other visualizations like XY plot or UMAP can be colored by the Neighborhood Analysis results, and Neighborhood Analysis module data can be overlaid on the tissue in the Image Viewer panel by selecting Cells, Cell Type, and Step: Neighborhood Analysis. See CosMx SMI Data Visualizations on page 70.

Ligand-Receptor (LR) Analysis - RNA
Prerequisite modules: Leiden Clustering or Cell Typing (InSituType)
Module description: Scores pairs of cells and individual cells for ligand-receptor signaling. Ligand-receptor target expression in adjacent cells is used to calculate a co-expression score. A test is then performed to determine if the overall average of the scores for each ligand-receptor pair is enriched by the spatial arrangement of cells. Specific cell types can be defined for the analysis. Note that a pipeline that includes LR Analysis will pause prior to this module to allow the user to designate the Leiden Clustering or Cell Typing data as the input to LR Analysis.
Custom pipeline module name (optional): Used to identify module results in downstream analysis; see full description on page 47.
Input parameters: Ligand expressing cell type(s); Receptor expressing cell type(s); Receptor expressing cell type permutations (default 100; must be an integer > 0); Calculation method (directional or non-directional). Directional counts L1:R1 as distinct from R1:L1 (two pairs), whereas non-directional counts those as one pair.
Output visualizations: Heatmap with each LR pair on the y-axis and flow cell names on the x-axis. Significant enrichment scores are colored distinctly from insignificant enrichment scores. See CosMx SMI Data Visualizations on page 70. (A results matrix of average LR score and significance of spatial enrichment for all LR pairs in the selected cell types is included in the tiledb object, available with data export, but is not available in the software user interface.)
[Page 64]
Spatial Network - RNA or Protein
Prerequisite modules: Quality Control
Module description: Creates a network or graph structure of the physical distribution of cells. Cells are converted to nodes in the graph, and connections between cells (e.g., nearest neighbors) are represented as edges. The network can be built in one of three ways: radius-based (all cells connected within a given radius), nearest neighbors, or Delaunay triangulation.
Custom pipeline module name (optional): Used to identify module results in downstream analysis; see full description on page 47.
Input parameters: Method of building the network (distance (default), nearest neighbors, or Delaunay):
• If "distance" method, select radius (µm): radius to select cells to create edges (default: 20 µm, range: 10 µm-100 µm).
• If "nearest" method, input number of nearest neighbors (cells) to evaluate (default: 5, range: 1-50, integers only).
Output visualizations: An adjacency matrix with dimensions number of cells x number of cells. Each edge is recorded in the matrix as the Euclidean distance between the cells. No visualizations are available, but this module's output can serve as the input to other modules.

Cell Type Co-Localization - RNA or Protein
Prerequisite modules: Cell Typing or Leiden Clustering
Module description: Examines the tendency of different cell types to be located near each other. Each pair of cell types defined from supervised or unsupervised clustering is tested using Ripley’s K-function (a function of the distance between the different cell types) for whether the cells’ spatial distribution differs from a theoretical Poisson point process where a cell’s location is not dependent on another cell’s location. The results are summarized in a heatmap indicating which cell types tend to cluster together or isolate from each other. In addition, a more granular view is shown when plotting the pair correlation function for a given cell type pairing as a function of the radius, which can reveal specific distances at which the cells of each type are co-localized.
Custom pipeline module name (optional): Used to identify module results in downstream analysis; see full description on page 47.
Input parameters: Radius (radius within which to evaluate neighbor cell types; 0-300 µm; default 100 µm); Stratify results across Flow cells or FOVs.
Output visualizations: Heatmap showing net difference across input radii between theoretical and observed K-function value; positive in red indicating clustering, negative in blue indicating separation.
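To make the comparison behind the Cell Type Co-Localization heatmap concrete, the sketch below computes a naive cross-type Ripley's K value and subtracts the value expected under a Poisson point process, which is π·r² in two dimensions. It is an illustration under stated assumptions, not the module's estimator: there is no edge correction, the function names are hypothetical, and the coordinates and tissue area are made up.

```python
from math import dist, pi


def cross_k(cells_a, cells_b, radius, area):
    """Naive cross-type K: area-scaled fraction of (a, b) pairs lying within `radius`."""
    close_pairs = sum(1 for a in cells_a for b in cells_b if dist(a, b) <= radius)
    return area * close_pairs / (len(cells_a) * len(cells_b))


def k_difference(cells_a, cells_b, radius, area):
    """Observed K minus the Poisson expectation pi * r^2 (positive suggests clustering)."""
    return cross_k(cells_a, cells_b, radius, area) - pi * radius ** 2


# Hypothetical coordinates (µm) in a 10 µm x 10 µm window (area 100).
type_a = [(0.0, 0.0), (10.0, 0.0)]
type_b_near = [(1.0, 0.0), (11.0, 0.0)]   # each sits next to a type-a cell
type_b_far = [(50.0, 50.0)]               # nowhere near any type-a cell

clustered = k_difference(type_a, type_b_near, radius=2.0, area=100.0)
separated = k_difference(type_a, type_b_far, radius=2.0, area=100.0)
```

Here `clustered` is positive and `separated` is negative, which corresponds to the red (clustering) and blue (separation) ends of the heatmap described above.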
[Page 65]
Pathway Analysis - RNA
Prerequisite modules: Normalization
Module description: Signaling pathway analysis is calculated on a per-cell basis using gene sets of pre-defined pathways. The R package AUCell (Aibar et al. 2017) is used to calculate the relative expression of different gene sets within a cell, and determine whether a gene set corresponding to a particular pathway is enriched. Gene sets which do not have sufficient coverage (20% of genes in gene set present in dataset) are excluded from analysis.
Custom pipeline module name (optional): Used to identify module results in downstream analysis; see full description on page 47.
Input parameters: Upload a gene set file in .gmt format using gene symbols as the input. Learn more about the .gmt file format in the Gene Set Enrichment Analysis article linked here.
Output visualizations: A cells-by-gene-set matrix with estimated gene set score for each cell is added to the study metadata and accessible through data export. Visualizations like XY plot or UMAP can be colored by the Pathway Analysis results, and Pathway Analysis module data can be overlaid on the tissue in the Image Viewer panel by selecting Cells, Cell Type, and Step: Pathway Analysis. See CosMx SMI Data Visualizations on page 70.

Spatial Expression Analysis - RNA
Prerequisite modules: Cell Typing (InSituType)
Module description: Identifies genes with spatially dependent expression patterns. This module identifies genes which have a spatial distribution that is non-uniform throughout a tissue, and which may be associated with specific tissue structures, microenvironment niches, or cell types. The module also measures associated spatial expression between genes which can be used to group genes into different spatial expression patterns. The two statistics calculated related to spatial expression patterns are Moran's I and Lee's L.
This module does not assume any specific relationship between structures in the tissue. Genes with significant count values should be visualized to determine how they are related to the tissue morphology.
If the input dataset exceeds 2000 genes, the module runs on the most variable 2000 genes as calculated by the module.
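Moran's I, one of the two statistics named above, can be illustrated with a short sketch of the standard textbook formula, I = (N / W) · Σ_ij w_ij (x_i − x̄)(x_j − x̄) / Σ_i (x_i − x̄)², where W is the sum of all weights. The binary chain-shaped weight matrix and the expression values are hypothetical, and how AtoMx SIP builds its spatial weights is not specified here.

```python
def morans_i(values, weights):
    """Moran's I for one gene; values[i] is expression in cell i, weights[i][j] the spatial weight."""
    n = len(values)
    mean_value = sum(values) / n
    deviations = [v - mean_value for v in values]
    total_weight = sum(sum(row) for row in weights)
    # Weighted sum of products of deviations over all ordered cell pairs.
    cross_products = sum(weights[i][j] * deviations[i] * deviations[j]
                         for i in range(n) for j in range(n))
    sum_of_squares = sum(d * d for d in deviations)
    return (n / total_weight) * cross_products / sum_of_squares


# Hypothetical tissue: four cells in a row, each linked to its direct neighbors.
chain_weights = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
patchy = morans_i([1, 1, 0, 0], chain_weights)       # similar values sit together
alternating = morans_i([1, 0, 1, 0], chain_weights)  # neighbors always differ
```

The patchy pattern gives a positive value and the alternating pattern gives a negative one; values near zero indicate no spatial structure. The Monte-Carlo test mentioned for this module would compare the observed value with values from shuffled cell labels, which this sketch leaves out.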
Custom pipeline module name (optional): Used to identify module results in downstream analysis; see full description on page 47.
Input parameters:
• Neighborhood size (number of cells in spatial network: default 10; must be an integer > 0).
• Receptor Expressing Cell Type (select cell type(s) of interest for the spatial expression analysis. Cell types will become available after successful upstream clustering, and that clustering will dictate the cell types available in this menu. If the box Cell types required is checked, then a selection must be made in the cell types dropdown menu. If it is unchecked, the pipeline continues without requiring cell type input).
• Gene List Name: If the Gene Selection module has been run, the resulting gene list is available to select for input to the Spatial Expression Analysis module. The gene list will be used even if it exceeds 2000 genes. If 'All genes' is selected, this module will run on the most variable 2000 genes as calculated by the module.
Outputs and visualizations:
[Page 66]
• A results table with Moran's I values for each gene and results from the Monte-Carlo test for significance of each I value.
• Lee's L association matrix with gene by gene measures of spatial association.
These outputs are not currently accessible from AtoMx, but can be accessed from an exported tileDB array. For more information about the output of Spatial Expression Analysis, refer to the CosMx SMI Liver Public Data Release subsection at https://nanostring.com/wp-content/uploads/2023/01/LiverPublicDataRelease.html#311_Spatial_Expression_Analysis.

Differential Expression (DE) - RNA
Prerequisite modules: Leiden Clustering or Cell Typing
Module description: This module performs Differential Expression analysis using generalized linear (mixed) models for single cell expression. This module allows the user to control for the expression of neighboring cells by including 'neighboring cell expression' of the analyzed gene as a fixed-effect control variable in the DE model.
Controlling for expression in neighboring cells is motivated by the observation that cells in close proximity on a tissue are not independent, and comparisons of DE between groups may be affected by cell-segmentation and cell-type uncertainty. Often, single-cell DE analyses may test whether genes are differentially expressed within a specific cell type. In practice, imperfect cell segmentation can result in overlap or bleed-over of transcripts from neighboring cells, which can also increase uncertainty in downstream cell-typing analyses. For these reasons, including the expression of the analyzed gene in neighboring cells can be a useful control variable. Two approaches are implemented in the DE module to handle this issue, configurable in Advanced Parameters:
1. An overlapping cells metric: Used to identify genes which may be expressed within specific cell types due primarily to overlapping cells or segmentation errors, and exclude them from cell-type specific DE analysis. The metric computes the average expression in the selected cell type, and the average expression in spatial neighbors of the selected cell type (only considering "other" cell types). The ratio of these two average expression vectors is a quick and useful way to discard implausible genes.
2. Covariate adjustment: Compute the total expression of the gene of interest in the spatial neighbors of the selected cell type (only considering "other" cell types), and include this as a control variable in the regression model.
For example, if analyzing a particular cell type like T cells, and some gene is highly expressed in neighboring cells of "other" cell types (non-T cells), then it may be prudent to exclude that gene from analysis because it may be prone to false expression from overlapping or imprecise segmentation. This module allows the researcher to configure settings to adjust DE analysis accordingly.
It is recommended to analyze the rate of gene expression per cell library size (total transcript counts across all genes) using either:
1. Raw counts, using negative binomial distribution with library size as an offset (this is the default option in the DE module in AtoMx SIP), or
2. Normalized cell expression (where normalization method takes into account the cell library size), using a linear (mixed) model / Gaussian distribution. More information on normalization is in the CosMx Analysis Scratch Space post on QC and Normalization of RNA Data.
[Page 67]
Custom pipeline module name (optional): Used to identify module results in downstream analysis; see full description on page 47.
Input parameters:
Basic:
• Flow cell or biological sample annotation. Default selection: Run_Tissue_name (when determining the neighbors of a cell in physical space, this parameter ensures that cells with similar spatial coordinates are not considered 'neighbors' unless they come from the same flow cell/sample (or other trait as selected from the dropdown menu)).
• Filter Cells by metadata: In v2.1, this module now supports multiple metadata-based cell filters. Some examples are listed here:
  • To run DE on a particular InSituType cell type, click Add Filter, then select RNA_Cell_Typing_InSituType from the Include cells dropdown. The Analyze cells dropdown populates with the cell types. Select the desired cell types to analyze.
  • To filter based on QC flags related to cell counts, click Add Filter, then select qcFlagsCellCounts from the Include cells dropdown. From the Analyze cells dropdown, select the inclusion criteria, such as Pass or Fail.
  • If the Sample Metadata .csv file has been edited using the GetSampleMetadata and UpdateSampleMetadata custom modules to include a new column of annotations, that column name may be selected here to filter on that criteria. For example, if a new column was added such as "tissue layer", with each row of the column containing "crypt", "submucosa", or blank, select "tissue layer" in this dropdown menu to filter based on this parameter. In the Analyze cells dropdown, select the subgroup you wish to include in the DE analysis.
• Genes/Proteins (multi-select; leave blank to analyze all genes. If selecting individual genes/proteins to analyze, the software limits the selection to 200 items).
• Distribution family (select nbinom2, gaussian, or poisson: the parametric family to be used for the regression model).
• Variable to use for Volcano Plot. Select one variable of interest. Summary outputs will be generated in .csv file format for all models in DE variable, which can be downloaded and used to create additional figures and plots as needed.
• Edit the Model Formula, if necessary. The term "otherct_expr" ("other cell type expression") is a default covariate, which is calculated as the expression of the analyzed gene in neighboring cells of other annotation types. See the Advanced tab for more options related to this covariate. Do not select cell_ID or CellID in the Model Formula or an error may occur.
Advanced:
• Neighbor expression category (select non-numerical variable of interest). For "otherct_expr" default covariate; metadata annotation (usually cell type) used to compute the expression of each gene in neighboring cells of other annotation groups. For an example cell with annotation 'a', compute the neighbor expression of each gene in neighbor cells which are not of annotation 'a'.
• Neighbor bandwidth (default 50 microns; range 1-100). For "otherct_expr" default covariate; upper limit for distance at which to consider a neighboring cell a "neighbor".
[Page 68]
• Max overlap ratio metric (default 1). For each annotation in "Neighbor Expression Category" (see above bullet), compute the average expression of each gene in the neighbors of cells of that category (among neighbors which are not the same category) and compute the average expression of each gene within the annotation.
Overlap ratio is then defined as the ratio:
(Avg Expression in Cells of Other Annotations) / (Avg Expression in cells of Annotation)
Ratios > 1 indicate higher expression in cells of other categories than the indexed category, and hence DE genes are more likely to be spuriously associated due to segmentation uncertainty. By default, 1 is the cutoff used for including a gene in the analysis. Set to a large number to remove filtering.
• Neighbor expression weighted by distance (select None or Weight). For "otherct_expr" default covariate; should neighbor expression of the gene in neighboring cells be weighted by distance? "Weight" corresponds to yes; "None" corresponds to equal weights regardless of distance.
• Method to aggregate neighbor expression (select Mean or Sum). For "otherct_expr" default covariate; should neighbor expression of the gene in neighboring cells of other types be summed (sum) or averaged (mean)?
• Normalize neighbor expression (select True [recommended] or False). For "otherct_expr" default covariate; should neighbor expression of the gene in neighboring cells of other types be normalized by the total counts? (This is recommended.)
Output visualizations: Volcano plot (log2 fold changes (x) against -log10(p-values) (y)). To access the volcano plot, click the image icon on the successfully executed DE module in the Pipeline Structure panel (see CosMx SMI Data Visualizations on page 70). Data tables summarizing the results are also saved to the tiledb dataset. To create a heatmap from DE data, please refer to instructions in the CosMx SMI Human Liver FFPE Dataset Vignette.

Novae - RNA (Spatial Discovery studies only)
Prerequisite modules: Integrated into the foundational analysis pipeline in a Spatial Discovery study; not available to add to a custom pipeline in a classic configuration study.
Module description: New in AtoMx SIP v2.2, the Novae module integrates the pre-trained foundation models for spatial domain discovery in spatial transcriptomics datasets via the Novae algorithm (see Appendix I: Literature References on page 86).
Unlike the existing Spatial Cluster (Neighborhood Analysis) module that is based on neighborhood cell-type composition, the Novae algorithm learns spatial domains from expression and spatial adjacency using a graph neural network with self-supervised contrastive learning and thus enables label-free discovery of spatial domains without the need for prior cell-type annotation. The models were trained with millions of cells across diverse tissue types and can identify spatial domains capturing tissue architecture and heterogeneity even with zero-shot inference.
If the Novae module does not execute successfully, it can be retried by clicking the play button on the module in the Pipeline Structure panel. When the module completes successfully, rerun the Spatial Discovery module to obtain updated results. If the Novae module has not executed successfully, then the Spatial Discovery view reflects results from the Neighborhood Analysis module for its coloring.
Processing time for this module is highly dependent on AtoMx SIP hub traffic, so it is difficult to estimate run time. For reference, a WTX study with 1.6 million cells required 4.25 hours for the Novae module to complete.
Input parameters: The module runs on default values, so it is not configurable at this time.
[Page 69]
Output visualizations: Spatial Discovery image view with cell coloring by niche (see Spatial Discovery View: Data Overlay on page 29). If the Novae module does not execute successfully, then the results of Neighborhood Analysis will be used to color cells in the default Spatial Discovery view.

Spatial Discovery - RNA (Spatial Discovery studies only)
Prerequisite modules: Integrated into the foundational analysis pipeline in a Spatial Discovery study; not available to add to a custom pipeline in a classic configuration study.
Module description: New in AtoMx SIP v2.2, the Spatial Discovery module leverages results from upstream modules in the foundational data analysis pipeline to generate a downloadable Spatial Discovery data package.
This data package includes plain text files containing marker gene summaries and pathway summaries faceted by Leiden cluster and spatial domain (or, in the event no Novae results are available, neighborhood niche assignments). These files can be further analyzed in your statistical language of choice (e.g., R, Python) as well as uploaded to your favorite Large Language Model (LLM) for continued conversational exploration of your CosMx SMI results.
Input parameters: The module runs on default values, so it is not configurable at this time.
Output: Module creates the downloadable Spatial Discovery package, which can be used as input to Large Language Models for further data exploration. The Download icon on the Spatial Discovery module in the Pipeline Structure panel downloads the same package as the Spatial Discovery button in the Data Overlay panel (described on page 29).
[Page 70]
CosMx SMI Data Visualizations

Study Statistics Table
Displays the Number of FOV, Mean transcripts per cell, Mean unique genes per cell, Number of non-empty cells, 10th percentile transcript per cell, 90th percentile transcript per cell, and Mean Negprobe counts per cell (Figure 44; see also Table 4 on page 46). Click the arrow (caret) to expand the list of FOV in the selected flow cell. Available for all studies based on Initial Data.
Figure 44: Study statistics table in Pipeline Data panel

QC Metrics Table
Displays the pass/fail metrics for the QC module run (Figure 45). RNA QC metrics are defined in Table 5 on page 49 and Protein QC metrics are defined in Table 6 on page 50.
Figure 45: Pipeline Data panel - QC metrics
[Page 71]
XY Plot
Displays data output of the selected module in X,Y space (Figure 46).
Available for the modules QC, Normalization, Cell Typing (InSituType), Cell Typing (CELESTA), Leiden Clustering, Neighborhood Analysis, and Pathway Analysis. From the dropdown menus, select the FOV, color coding method (count, sum, or average), and the clustering step to plot. Click the arrow (caret) to display more customizations, including the specific gene(s) to plot; cell type(s); density, honeycomb, or scatter view; and tools such as area selection and zoom. Optionally, enable automatic scatter view when the number of points displayed is less than 10,000. Choose from available color palettes from the dropdown menu.
Certain data from an XY Plot can also be overlaid on the tissue itself in the Image Viewer panel. See Recommended Data Overlays and Interactivity using the Image Viewer Panel on page 33.
Figure 46: XY plot with honeycomb view in Pipeline Data panel

Heatmap
Displays data output of the selected module as a heatmap, sorting by FOV and targets (Figure 47). Available for the modules QC, Normalization, Cell Typing (InSituType), Identify Marker Genes, Ligand-Receptor Analysis, and Cell Type Co-Localization. The heatmap header options will differ depending on the heatmap and the module it represents.
For heatmaps from modules such as QC or Normalization, select the FOV to visualize from the dropdown menu. Use the buttons to select Linear or Log2 scaled data; display the row and/or column names; and choose the fit of the data displayed in the panel. Click the arrow (caret) for more options, including Zoom, Save, and adjustments to the color palette.
For heatmaps from modules such as Identify Marker Genes, toggle between displaying All Genes or Top Markers. If Top Markers, select the number of markers to display using the slider bar. You may select additional markers to display (even if not a top marker) from the Genes dropdown list. Heatmap name and axis names can also be edited.
Figure 47: Heatmap visualization of QC module data
[Page 72]
Box Plot and Violin Plot
Displays data output of the selected module as a box-and-whisker or violin plot (Figure 48). (Violin plot only available for the Normalization module.) Select FOV(s) for data visualization from the dropdown menu. Use the buttons in the box plot header to select Linear, Log2, or Log10 scaled data, and to hide or display points. Click the arrow (caret) for more options, including a toggle between box, violin, or combination display; minimum expression value threshold; and custom plot title and axis names. To export box plot data, click the Save icon in the top right of the chart.
Figure 48: Pipeline Data panel - box plot

Histogram
Displays the Number of Cells (y-axis) with a particular Counts per Cell value (x-axis) (Figure 49). The histogram header options will differ depending on the histogram and the module it represents. Available for the modules QC, Normalization, and Expression Model (CELESTA).
For histograms from modules such as QC or Normalization (Figure 49), select genes of interest from the dropdown menu. If cell typing has been performed, select certain cell types under the second dropdown menu. If desired, adjust the number of bins (how many categories the x-axis data is sorted into). Choose between Linear, Log 2, or Log 10 scaling. Click the arrow (caret) to rename the axes and change the bin color and opacity. Click the Save icon in the top right of the chart to export histogram data.
Figure 49: Pipeline Data panel - histogram
[Page 73]
PCA Plot
Displays the 2D representation of Principal Component Analysis as a scatter plot with default axes Principal Component 1 (PCA_1) and Principal Component 2 (PCA_2) (Figure 50).
Select alternative axes from the Components dropdown menu. If cell typing has been performed, select particular cell types from the third dropdown menu. Click the arrow (caret) to access the selection tools.
Figure 50: Pipeline Data panel - PCA

UMAP Plot
Displays the UMAP analysis for all FOVs and flow cells in the study as a scatter plot (Figure 51). Select a Color by option from the dropdown menu: morphology marker expression, flow cell, total counts, or (depending on which modules have been run) Leiden clusters, cell types, spatial neighborhoods, or pathways. Select cell types/clusters to plot from the dropdown menu. Toggle Enable selection to on to allow the selection of data point(s) in the graph using a lasso, square, circle, or rectangle annotation tool. The tools appear as icons on the top-right of the UMAP (you may need to hide the header to see them).
Click the arrow (caret) to access additional visualization settings. These settings are based on the concept of tiles, which can be thought of as an n x n grid that makes up the display. Select the data reduction method (see the gray box, below). Adjust the Tile Count (number of tiles comprising the display), Tile Capacity, Max Data Points, Points size, and Points Transparency, as desired. These are questions of aesthetics and personal preference.
Figure 51: Pipeline Data panel - UMAP
Data reduction is required because there are not enough pixels on any screen to display the very large number of points making up the UMAP. The method of data reduction can impact the shape of the UMAP and conclusions drawn from it. Therefore, control over the method of data reduction is left to the user. Normalization method normalizes data in each tile of the display; saturation method sets a maximum number of dots per tile. It may be a matter of aesthetic preference for the user.
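As a rough illustration of the saturation idea described above (a maximum number of dots per tile), the sketch below bins points into an n x n grid and keeps only the first points that fit within each tile's capacity. The function and parameter names are hypothetical and merely echo the Tile Count and Tile Capacity settings; the way AtoMx SIP actually selects which points to keep is not documented here.

```python
def saturate_by_tile(points, tile_count, tile_capacity):
    """Keep at most `tile_capacity` points in each tile of a tile_count x tile_count grid."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    min_x, min_y = min(xs), min(ys)
    # Guard against a zero span when all points share one coordinate.
    span_x = (max(xs) - min_x) or 1.0
    span_y = (max(ys) - min_y) or 1.0
    kept = []
    counts = {}
    for x, y in points:
        # Map each coordinate to a tile index, clamping the maximum onto the last tile.
        col = min(int((x - min_x) / span_x * tile_count), tile_count - 1)
        row = min(int((y - min_y) / span_y * tile_count), tile_count - 1)
        if counts.get((row, col), 0) < tile_capacity:
            counts[(row, col)] = counts.get((row, col), 0) + 1
            kept.append((x, y))
    return kept


# Hypothetical UMAP coordinates: three crowded points and two isolated ones.
umap_points = [(0.0, 0.0), (0.1, 0.1), (0.2, 0.2), (9.0, 9.0), (10.0, 10.0)]
displayed = saturate_by_tile(umap_points, tile_count=2, tile_capacity=2)
```

In this toy case the crowded tile loses its third point while the sparse tile keeps both, which shows why dense regions of a UMAP can look thinner than they really are after data reduction.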
[Page 74]
Volcano Plot
Displays the results of the Differential Expression module by plotting log2 fold changes on the x-axis against -log10(p-values) on the y-axis. Access the volcano plot visualization as an HTML file after successful execution of the Differential Expression module. Click on the image icon on the module in the Pipeline Structure panel to download the file.
Figure 52: Example of a volcano plot showing results of Differential Expression analysis.

Flightpath Plot
Illustrates the tendency of different cell types to be confused with each other (Figure 53). This plot type displays cells in groups as a function of their probability of being a particular cell type. Each cell type is given a centroid, and placed near other cell types with similar profiles. Then, each individual cell is placed based on its probability of belonging to each centroid. For example, cells with 100% confidence are placed directly atop their centroid, and a cell with 50% confidence in two cell types will be placed directly between their centroids.
Figure 53: Example of a flightpath plot. Dots between centroids represent cells with ambiguous identity.
Access the flightpath plot as a .PNG file after successful execution of the Cell Typing (InSituType) module. Click on the image icon on the module in the Pipeline Structure panel to download the file.
Flightpath plots generated by the Cell Typing (InSituType) module are labeled with cluster identification (a, b, c...) and confidence score (average of the cluster's cells' probability of belonging to that cluster). Clusters can be renamed using the Cell Type QC module (see section Cell Type QC - RNA on page 60).
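The placement rule described for the flightpath plot is consistent with a probability-weighted average of centroid positions, sketched below. This is an assumption made for illustration: the exact layout procedure used by the software is not given in this manual, and the function name, centroid coordinates, and probabilities are hypothetical.

```python
def flightpath_position(probabilities, centroids):
    """Place a cell at the probability-weighted average of cell-type centroids."""
    x = sum(p * centroids[cell_type][0] for cell_type, p in probabilities.items())
    y = sum(p * centroids[cell_type][1] for cell_type, p in probabilities.items())
    return (x, y)


# Hypothetical centroid layout for two clusters labeled a and b.
centroid_layout = {"a": (0.0, 0.0), "b": (4.0, 2.0)}

confident_cell = flightpath_position({"a": 1.0}, centroid_layout)
ambiguous_cell = flightpath_position({"a": 0.5, "b": 0.5}, centroid_layout)
```

With these inputs the confident cell lands on centroid a and the ambiguous cell lands at the midpoint between a and b, matching the two examples in the description above.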
[Page 75]

Save a Visualization

If you modify a visualization's default settings and then navigate away from the visualization, the software will prompt you to save the visualization settings (Figure 54). Click Cancel to navigate away without saving the settings. To save the settings, enter a settings name and click Save. Once saved, the visualization is available from the dropdown menu at the top of the Pipeline Data panel (Figure 55).

Figure 54: Save visualization settings prompt

Figure 55: Saved visualization settings available in dropdown menu

[Page 76]

Export Images

Open the Image Viewer panel and select the Export tab from the Image Viewer menu (Figure 56). Select from options to export the full image or the on-screen view, and customize the appearance, format, and quality. Individual image layers can be selected or deselected. Cell segmentation in the export is based on its visibility in the Image Viewer. To exclude the Preview scan from the export, disable (unselect) the channels for this layer on the right side of the Image Viewer.

Figure 56: Export Images from the Image Viewer

Exported images are downloaded directly, and a notification with a link appears in the AtoMx SIP notifications pane. Please note that the link from the notifications pane expires 6 hours from the time of download. External users may not export images or data.

If exporting file type .jpg results in an error, exclude the scalebar by unchecking its box (Figure 56) and try again.

[Page 77]

Export Data

The built-in export function exports decoded data files, Seurat object(s), corresponding TileDB array, and/or flat .csv files.
The decoded data comprises transcript counts and locations, annotation metadata, and user-initiated data transformations performed in AtoMx SIP prior to export. All results up to the point of export will be available in the Seurat object and TileDB array. While the RNA and Protein studies share the same format, the structure of the Additional Files folder will vary based on the analyte. Data is exported on a per-study basis (Seurat objects are split by flow cell).

Beginning in AtoMx SIP v2.0, exported data can be directed to an AWS S3 bucket or to a tenant-specific host location, which is accessed with an sFTP client to download data locally. Data is retained in the tenant-specific host location for 2 weeks from export. In AtoMx SIP v2.0, both methods of data export generate an md5sum file for checking the integrity of the export job.

External users may not export images or data.

To export data:

1. In the study of interest, click Export from the Study Details panel (Figure 57) to launch the Export Dataset dialog.

Figure 57: Export button in Study Details panel

Figure 58: Export Dataset dialog, Input Parameters tab

2. In the Input Parameters tab, select the files and/or objects to export (Figure 58). Refer to Table 8 for file descriptions. To prevent the duplication of files and associated costs, export decoded files only once per study. If desired, enter a value for the maximum allowed export timeout duration (4 - 96 hr; default value is 48 hr; the export will stop if it exceeds the duration selected).
[Page 78]

Flat CSV Files
• Count matrix: Count matrix file (cells with gene counts)
• Cell metadata: Cell metadata file (local (within the FOV) and global (within the flow cell) cell coordinates, morphology marker intensities)
• Transcript: Global transcripts file (global coordinates for every individual transcript on the flow cell; not applicable for protein studies)
• Polygons: Global cell boundaries file (global coordinates for the vertices of the polygon that represents the cells' boundaries - a representation of the cell segmentation)
• FOV positions: Global FOV position file (coordinates of the top left of each FOV)

Tertiary Analysis Objects
• Seurat object(s): Seurat object(s) comprised of counts, metadata, and dimensional reduction outputs. One Seurat object is exported for each flow cell in the study.
• Transcript coordinates: The exported Seurat object(s) will include transcript coordinates
• Polygon coordinates: The exported Seurat object(s) will include polygon (cell segmentation) coordinates
• TileDB array: Default tertiary analysis structure

Additional Files (WARNING! Large data. Including these files will significantly increase export folder size.)
• Additional Files: Not recommended to export - large data.
• Morphology2D folder: Morphology images - large data.
• Other miscellaneous data files: If available - large data.

Table 8: Files for export

3. In the Export Access tab of the Export Dataset dialog, select sFTP or S3 (Figure 59).
• For sFTP access, copy the credentials and Output Folder Name provided.
• For S3 access, provide the S3 file path in the format s3://bucket/object/, AWS keys which have write capabilities to this S3 bucket, AWS region in the format us-west-2, and session token (if configured). Refer to Appendix III: Setup to Export Data to an AWS S3 Bucket on page 90 for more information. Please note that large exports may exceed the 12-hour limit of AWS session tokens.

[Page 79]

4. Click Export. Export progress is updated in the Study Details panel. Once complete, view and download export job logs from the link Show Export Details.

Figure 59: Export Dataset dialog, Export Access tab

5. To access data exported to the tenant-specific host location using a secure file transfer protocol (sFTP) client:
a. If you don't yet have an sFTP client, download one such as WinSCP. Consult with your institution's IT team to ensure compliance with internal policies.
b. Open the program and click New Site (Figure 60). Steps may differ if using a program other than WinSCP.
c. Enter the following information and click Login.
• Host name: copy exactly from Export Dataset dialog (Figure 58).
• Port number: 22
• Username and password: copy username exactly (case-sensitive) from Export Dataset dialog (Figure 58) in the format username@tenantname.com. Password is the one used to access AtoMx SIP.
d. Once connected, the available exported files are displayed in the right pane (Figure 61). Download specific studies by selecting them and clicking Download, or move folders or files to a local folder selected in the right pane. Refer to Export Data on page 77 for details about output.

Figure 60: WinSCP connection to exported data

[Page 80]

Figure 61: Exported data in WinSCP window

Data are retained in the tenant-specific host location for 2 weeks from export.

Please be aware of these considerations when exporting data or troubleshooting data export:
• If export to S3 bucket fails, confirm that the AWS credentials used have Write permissions for the bucket.
• If receiving the error "Previous export failed" and the export details indicate "Pod terminated preemptively", or the export job logs indicate "ExpiredToken", the export job exceeded the timeout duration set by the user or set by AWS for S3 sessions (12 hours). Please break up the export into smaller jobs by reducing the number of flow cells in the study or selecting fewer files to export from the Export Dataset dialog (Figure 58).
• If md5sum checksum values don't match, run the export job again. If the values still don't match, contact support.spatial@bruker.com for support.
• An alternative method of downloading study data from AtoMx SIP is to use the CosMxDataDownloader, available upon request. This Python application requires proficiency with the command line. Contact support.spatial@bruker.com for more information.
• If the study is too large to reliably include transcripts in the Seurat object, a transcript flat file is generated instead.
• For data exported for access by sFTP, Org Admins can see previous export activity of all users, but export output is sent only to the sFTP folder of the user who initiated the export.

[Page 81]

Working with Exported Data

These computational packages are required to interact with exported data objects:
• Seurat: R Toolkit for Single Cell Genomics. Install in RStudio using install.packages("Seurat")
• tiledbsc: an R implementation of the Stack of Matrices, Annotated (SOMA). Install in RStudio using remotes::install_github("tiledb-inc/tiledbsc", force = TRUE)
• tiledbr: an R interface to the storage engine of TileDB. Install in RStudio using remotes::install_github("TileDB-Inc/TileDB-R", force = TRUE)

For additional resources on analyzing CosMx SMI data outside of AtoMx SIP, please refer to the Biostats CosMx Analysis Scratch Space and accompanying blog, hosted on GitHub.
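Before working with a downloaded export, the md5sum integrity check mentioned in the export considerations above can be run locally. The Python sketch below recomputes MD5 digests and compares them against a manifest. It assumes the manifest uses the conventional md5sum layout (one line per file: hex digest, whitespace, relative path); the actual layout of the file produced by AtoMx SIP is not described in this manual, so adjust the parsing as needed. The function names are hypothetical.

```python
import hashlib
from pathlib import Path


def md5_of_file(path, chunk_size=1024 * 1024):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_mismatches(manifest_path, root_dir):
    """Return the relative paths whose digest differs from the manifest.

    Assumes md5sum-style lines: "<hex digest>  <relative path>".
    Files listed in the manifest but missing on disk are also reported.
    """
    mismatches = []
    for line in Path(manifest_path).read_text().splitlines():
        if not line.strip():
            continue
        expected, _, name = line.partition(" ")
        # md5sum marks binary mode with a leading "*" before the file name.
        name = name.strip().lstrip("*")
        target = Path(root_dir) / name
        if not target.is_file() or md5_of_file(target) != expected.lower():
            mismatches.append(name)
    return mismatches
```

An empty list from find_mismatches suggests the download is intact; any listed file is a candidate for re-download or, if the problem persists, for re-running the export as described above.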
For a detailed walk-through of exported CosMx SMI data, see the CosMx SMI Liver Dataset vignette.

The following diagrams illustrate the structure of data exported using the Export function. (Please note that the structure is subject to change, so this information is provided as general guidance.) The input parameters selected in the Export button dialog for this example are shown in Figure 62. (For large studies containing multiple flow cells, it is not recommended to select all files for download, as depicted in the figure. Bin the exports or reduce the number of flow cells in the study.)

Figure 62: Input parameters for the exported data shown in subsequent figures

Figure 63 shows the second-level directory tree of the export folder for an RNA flow cell named "6K_TMA". The TileDB folder is the exported TileDB array. The tertiary analysis objects (Seurat objects) are in the root folder. One Seurat object is exported for each flow cell in the study.

Figure 63: Second level directory tree of export folder of an RNA study. See subsequent figures to expand the DecodedFiles and flatFiles folders.

[Page 82]

Figure 64 expands the directory to the fourth level of the DecodedFiles folder.
• The AnalysisResults and AnalysisResultsArchived folders contain log files per FOV of target call and run logs.
• The CellComposite folder contains 5-channel composite images per FOV. CellComposite images are available in Control Center v1.4.1 / AtoMx v1.3.2.4 and later, but not in the previous Control Center or AtoMx versions. For flow cells run on previous versions, CellComposite images can be created from Morphology2D images (resource on GitHub).
• The CellOverlay folder contains greyscale images of each FOV with cell segmentation overlaid on the image.
• The FOV# folders contain cell and compartment label image files displaying pixel intensity values for cell_IDs and segmentation compartments, and mean/max intensity values for each morphology marker and cell ID.
• The Morphology2D folder contains 5-channel layered tif images of each FOV. These files can be used to create CellComposite images (resource on GitHub) (see CellComposite bullet, above).
• The RnD folder contains run summary statistics of membrane and nuclei signal, membrane segment, cell coverage, the average cell area, and number of cells for each FOV, as csv files.
• The Segmentation folder contents are identical to the output structure of the CellOverlay folder, FOV# folder, and RnD folder. The greyscale images, cell and compartment labels, and run summary are specific to the applied cell segmentation profile.
• The RunSummary folder contains the experimental configuration parameters of the run and the spatial metrics (cycle/reporter number, fiducial intensity and background, UV cleavage efficiency, etc.). The imaging shading profile is stored in the Shading folder. The FovTracking folder contains files mapping the FOV position to stage position, and the QCDir includes the run summary status for each FOV, including registration status, channel intensity, and spot quality, as csv files.
• The Logs folder contains log files of the imaging run.

Figure 65 shows the second level directory tree of the flatFiles folder. The folder contains all flat files for downstream data analyses, as set in the input parameters of the Export button.

[Page 83]

Figure 64: Fourth level directory tree of DecodedFiles folder of an RNA study.

Figure 65: Second level directory tree of the flatFiles folder of an RNA study.

[Page 84]

Figure 66 shows the second-level directory tree of the export folder for a protein study named "NanoString_protein" and flow cell named "Protein". The TileDB folder is the exported TileDB array. The tertiary analysis objects (Seurat objects) are in the root folder. One Seurat object is exported for each flow cell in the study.
Figure 66: Second level directory tree of export folder of a protein study. See subsequent figures to expand the DecodedFiles and flatFiles folders.

Figure 67 shows the fourth level directory tree of the DecodedFiles folder for a protein flow cell.
• The AnalysisResults and AnalysisResultsArchived folders contain per channel stats for each FOV and run logs.
• The CellComposite folder contains 5-channel composite images per FOV. CellComposite images are available in Control Center v1.4.1 / AtoMx v1.3.2.4 and later, but not in the previous Control Center or AtoMx versions. For flow cells run on previous versions, CellComposite images can be created from Morphology2D images (resource on GitHub).
• The FOV# folders within CellStatsDir contain cell and compartment label image files displaying pixel intensity values for cell_IDs and segmentation compartments, and mean/max intensity values for each morphology marker and cell ID.
• The CellOverlay folder within CellStatsDir holds greyscale images of each FOV with cell segmentation overlaid on the image.
• The Morphology2D folder contains 5-channel layered tif images of each FOV. These files can be used to create CellComposite images (resource on GitHub) (see CellComposite bullet, above).
• The RnD folder contains run summary csv files for each FOV, including the percentage of cells with membrane and nuclei signal, the average membrane segment, cell coverage, average cell area, and number of cells.
• The Segmentation folder contents are identical to the output structure of the CellOverlay folder, FOV# folder, and RnD folder. The greyscale images, cell and compartment labels, and run summary are specific to the applied cell segmentation profile.
• The RunSummary folder contains the experimental configuration parameters of the run and the spatial metrics (cycle/reporter number, focus and x, y, z position for each channel, UV cleavage efficiency, etc.).
The distortion and imaging shading profiles are stored in the Distortion and Shading folders, respectively. The FovTracking folder contains files mapping the FOV position to stage position.
• The Logs folder contains log files of the imaging run.

[Page 85]

Figure 67: Fourth level directory tree of DecodedFiles folder of a protein study.

Figure 68: Second level directory tree of AnalysisResults folder

Figure 68 shows the second-level directory tree of the AnalysisResults folder. The PerCellStats folder contains cell statistics for all channels and for each protein target in csv format. The ProteinImages folder contains protein image files for each protein in 16-bit tiff format displaying protein expression. The ProteinMasks folder contains mask files for each protein showing the target area.

Figure 69 shows the second level directory tree of the flatFiles folder of a protein study. The folder contains all flat files for downstream data analyses, as set in the input parameters of the Export button. See note about flat file compression in the box on page 77.

Figure 69: Second level directory tree of the flatFiles folder of a protein study.

[Page 86]

Appendix I: Literature References

These references provide additional information on the modules of the CosMx SMI Data Analysis Suite.
Cell Segmentation
• https://github.com/MouseLand/cellpose
• https://cellpose.readthedocs.io/en/latest/#

Quality Control
• https://www.itl.nist.gov/div898/handbook/eda/section3/eda35h1.htm

Normalization - RNA
• https://scanpy-tutorials.readthedocs.io/en/latest/tutorial_pearson_residuals.html
• https://genomebiology.biomedcentral.com/articles/10.1186/s13059-021-02451-7
• https://satijalab.org/seurat/reference/normalizedata

UMAP
• https://pubmed.ncbi.nlm.nih.gov/30531897/

Cell Typing (InSituType)
• https://www.biorxiv.org/content/10.1101/2022.10.19.512902v1.full

Expression Model and Cell Typing (CELESTA)
• https://doi.org/10.1038/s41592-022-01498-z

Neighborhood Analysis
• https://pubmed.ncbi.nlm.nih.gov/32763154/
• https://pubmed.ncbi.nlm.nih.gov/27818791/

Leiden Clustering
• https://www.nature.com/articles/s41598-019-41695-z

Spatial Expression Analysis
• https://link.springer.com/article/10.1007/s101090100064

Differential Expression
• https://github.com/glmmTMB/glmmTMB
• https://github.com/rvlenth/emmeans
• https://www.nature.com/articles/s42003-021-02146-6

Cell Type Co-Localization
• https://www.jstor.org/stable/2984796
• https://book.spatstat.org/

Signaling Pathways
• https://pubmed.ncbi.nlm.nih.gov/28991892/
• https://github.com/aertslab/AUCell
• https://bioconductor.org/packages/release/bioc/html/AUCell.html

Pathway Analysis
• https://doi.org/10.1038/nmeth.4463

Novae
• https://www.nature.com/articles/s41592-025-02899-6

[Page 87]

Appendix II: Create a Signature Matrix and Tuning Parameter File for Cell Typing (CELESTA)

Refer to Cell Typing (CELESTA) - Protein on page 59 for information about the purpose of the signature matrix and tuning parameter files.

To create a custom signature matrix:

1. Download the default signature matrix (human or mouse) from Github/Nanostring-Biostats/CelestaSignatureLibrary to use as a template. From the matrix file's page in GitHub, click Raw to open the raw data in the browser. Right-click and select Save As... to save as a .csv file to your computer (Figure 70).

Figure 70: Click 'Raw' then right-click, Save As... to save the default signature matrix from the CELESTA Signature Library in GitHub.

2. Do not edit the contents of cell A1 or B1. Edit Column A's rows to list the cell type names to be assigned through cell typing. Edit Column B's rows to indicate the lineage level of each cell type, using the format: ClusteringLevel_CellTypeNumberDescendedFrom_OverallCellTypeNumber (Figure 71).

Figure 71: Creating a custom signature matrix - lineage levels.

[Page 88]

3. Edit the column headers (starting at Column C) to match the names of the markers which will define the cell typing. The entire CosMx SMI panel does not need to be included. Only the markers listed will be used for cell typing.

4. Fill in the matrix to reflect the marker expression in each of the cell types (Figure 72). A value of 1 indicates the protein should be expressed in the cell type; 0 indicates it should not be expressed. Blank (or NA, if read into R) indicates the protein is not considered in the scoring function.

Figure 72: Creating a custom signature matrix - expression values.

5. Save the signature matrix file, then upload it to the Expression (CELESTA) or Cell Typing (CELESTA) module parameters.

To create a custom tuning parameter file:

1. Download the default tuning parameter file from Github/Nanostring-Biostats/CelestaSignatureLibrary to use as a template (see download instructions above).

2. Edit the tuning parameter file to match the signature matrix it will be run with: the same number of rows and the same cell type names in Column 1.

3. Next, fill in the tuning parameters (Figure 73) as described below. It's recommended to tune 1-2 markers at a time (edit the file, run Cell Typing (CELESTA), evaluate results, edit the file again, and re-run, as needed).
[Page 89]

Figure 73: Creating a custom tuning parameter file.

• Adjust the high expression threshold values for anchor cells (Column B): if a cell's probability of expressing the canonical marker(s) for this cell type (defined in the signature matrix) is greater than this threshold, the cell can become an "anchor cell" in the CELESTA algorithm (see Cell Typing (CELESTA) - Protein on page 59). Increasing the value makes the cell typing calls more stringent (cells must have higher expression of the marker(s) named in the signature matrix to be designated an anchor cell).
• Adjust the high expression threshold values for index cells (Column D): a cell's probability of expressing the canonical marker(s) for this cell type (defined in the signature matrix) must be greater than this threshold to be assigned the cell type. Increasing the value makes the cell typing calls more stringent (cells must have higher expression of the marker(s) named in the signature matrix to be designated as that cell type).
• The low expression threshold values are generally robust and do not require tuning. They specify that a cell may be considered a particular cell type as long as the markers that should NOT be expressed in that cell type score < 0.9/1. For example, a cell may be typed as an immune cell as long as the non-immune markers (as defined in the signature matrix) are not very highly expressed. If the non-immune markers are highly expressed, there is no justification to call the cell an immune cell.

4. Save the tuning parameter file, then upload it to the Cell Typing (CELESTA) module parameters.
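Before uploading, the formatting rules in the steps above can be checked with a short script. The Python sketch below flags signature matrix entries other than 1, 0, blank, or NA, flags lineage entries that do not look like three underscore-separated numbers, and confirms that the tuning parameter file lists the same cell type names. Several points are assumptions rather than documented behavior: that both files have a single header row, that the lineage fields are integers (for example 1_0_1), and that cell type names must appear in the same order in both files. All function names are hypothetical.

```python
import csv
import io
import re

# Assumption: lineage is three underscore-separated integers, matching
# ClusteringLevel_CellTypeNumberDescendedFrom_OverallCellTypeNumber.
LINEAGE_PATTERN = re.compile(r"^\d+_\d+_\d+$")
# 1 = expressed, 0 = not expressed, blank or NA = not used in scoring.
ALLOWED_VALUES = {"0", "1", "", "NA"}


def read_rows(csv_text):
    """Parse CSV text into a list of non-empty rows (lists of strings)."""
    return [row for row in csv.reader(io.StringIO(csv_text)) if row]


def check_signature_matrix(rows):
    """Return a list of human-readable problems found in the matrix."""
    problems = []
    markers = rows[0][2:]  # marker names start at Column C
    for row_number, row in enumerate(rows[1:], start=2):
        cell_type = row[0].strip()
        lineage = row[1].strip() if len(row) > 1 else ""
        if not LINEAGE_PATTERN.match(lineage):
            problems.append(
                f"row {row_number} ({cell_type}): unexpected lineage '{lineage}'"
            )
        for marker, value in zip(markers, row[2:]):
            if value.strip() not in ALLOWED_VALUES:
                problems.append(
                    f"row {row_number} ({cell_type}): marker {marker} has value '{value}'"
                )
    return problems


def tuning_matches_signature(signature_rows, tuning_rows):
    """True when both files list the same cell type names in Column 1."""
    signature_names = [row[0].strip() for row in signature_rows[1:]]
    tuning_names = [row[0].strip() for row in tuning_rows[1:]]
    return signature_names == tuning_names
```

To use it, read each .csv file as text, pass the text to read_rows, and review the returned problems before uploading. A check like this does not catch marker names that differ from those used by the software; that case only surfaces when the module runs.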
If the Cell Typing (CELESTA) module fails in the pipeline, the reason could be that the marker names in the signature matrix do not exactly match the marker names as encoded in the software. Open the module run log file by clicking on the metrics icon on the module block in the Pipeline Structure panel, and check for the error message "...markers not found in marker_expr_matrix." If this error occurred, the marker specified in the error message has a different name in the CosMx SMI software. Check the dropdown gene lists in the CosMx SMI Data Analysis Suite to see what name the software uses for the marker, and change the appropriate column header in the signature matrix to match. Then, re-run the pipeline step.

[Page 90]

Appendix III: Setup to Export Data to an AWS S3 Bucket

The following steps are provided to supplement AWS user documentation. Refer to Getting Started with Amazon S3 for more comprehensive instructions and technical support, and/or speak with your institution's Informatics or IT team.

The AWS Free S3 5 GB plan will suffice for the transfer and storage of most studies (excluding decoded data files). (Transfer of larger files is permitted with this plan, but will incur a modest cost.) Plan options may change; please refer to AWS user documentation to decide the best plan for your needs.

1. Create an AWS account.

2. Sign in as a Root user (Figure 74).

Figure 74: Sign in as root user

3. Click on Services in the top left of the screen, then click Storage, then S3 (Figure 75).

Figure 75: Storage window: S3

[Page 91]

4. Click Create bucket (Figure 76; your view may be different).

Figure 76: Create bucket button

5. Fill in Bucket name (do not use spaces, uppercase letters, or special characters) (Figure 77).
Choose the AWS region that matches the AtoMx SIP AWS region (if known; e.g., US East, EU). Leave all other options on this page as the default values. Click Create bucket.

Figure 77: Fill in bucket name and choose AWS region

6. Return to Buckets and note the AWS region associated with your bucket (Figure 78). Click on the name of your newly created bucket to access it.

Figure 78: Buckets window

7. Click Create folder to make a new destination folder for CosMx SMI data export (Figure 79).

[Page 92]

Figure 79: Create folder button

8. Fill in the folder name (do not use spaces or special characters). Select Encryption key type: Amazon S3 managed keys. Click Create folder (Figure 80).

Figure 80: Enter a folder name

9. Check the box to the left of your newly created folder and click Copy S3 URI (the URI should be in the format: s3://atomxtest/s3demo/) (Figure 81). This is the destination S3 file path which you will input to the CosMx SMI Export Dataset dialog in AtoMx SIP.

[Page 93]

Figure 81: Copy S3 URI button

10. Next, you will generate access keys to access this S3 bucket. Click on Services, then Security, Identity, & Compliance, then select IAM (Figure 82).

Figure 82: IAM settings

[Page 94]

11. Select Users from the left menu (Figure 83).

Figure 83: Select Users

12. Click Add users (Figure 84).

Figure 84: Add users button

13. In the User details window, specify a user name (Figure 85). Click Next.

Figure 85: Specify a new user name

14. In the Set permissions window, select Permission options: Add user to group, then click Create group (Figure 86).
[Page 95]

Figure 86: Set permissions window

15. In the window Create user group, create a user group name without spaces (Figure 87). Under Permissions policies, search for S3 then check the box for AmazonS3FullAccess. Click Create user group.

Figure 87: Create user group window

16. You should see the User group you've just created in the list User groups (Figure 88). Check the box to the left of the name, then click Next.

[Page 96]

Figure 88: Check the box next to the user group name

17. In the Review and create window, confirm that the group name is listed in the Permissions Summary (Figure 89). Click Create User.

Figure 89: Create user - permissions summary

18. A pop-up message indicates that the user is created successfully (Figure 90). Click View user.

Figure 90: View user button

19. Click on the tab Security credentials (Figure 91).

[Page 97]

Figure 91: Security credentials from user name menu

20. Scroll down to Access keys and click Create access key (Figure 92).

Figure 92: Create access key button

21. You may be presented with a list of alternatives to access keys (Figure 93). Select Other and click Next.

Figure 93: Alternatives to access keys

[Page 98]

22. Optionally enter a description tag for the key, then click Create access key (Figure 94).

Figure 94: Optional description for access key

23. Copy the access key and secret access key by clicking each copy icon (Figure 95) and pasting them in a safe place. Click Download .csv file for additional security backup to document your S3 access key and secret access key.
Figure 95: Retrieve access key credentials

24. The destination S3 file path (or S3 URI), access key, secret key, and region are then entered into the fields of the Export Dataset dialog (refer to instructions on page 77). The export can be configured with session tokens as well, although please note that large exports may exceed the 12-hour limit of AWS session tokens.

[Page 99]

Appendix IV: Download CosMx SMI Files From S3 Bucket After Export

NOTE: It may be preferable to manage your decoded data in AWS rather than download it to a local environment. Downloading from S3 (file egress) incurs costs according to the AWS pricing structure. Please refer to AWS documentation for additional support managing your decoded data in AWS.

Files can be downloaded using the S3 console; folders must be downloaded using the command line interface (CLI). The following steps may be performed using AWS CLI (preferred) or Anaconda Prompt.

IMPORTANT: Before starting, ensure that the local environment has enough free space to accommodate the files.

1. Install the AWS CLI script package on your computer (or, if using Anaconda Prompt, install the Anaconda Client).
a. Navigate to the AWS Command Line Interface website and download the appropriate installer from the links on the right (Figure 96).
b. Follow the prompts to install CLI to your computer.

Figure 96: Download AWS CLI Installer

2. Once install is finished, open Command Prompt on Windows by pressing the Windows key on your keyboard and typing 'command' (Figure 97). Click the Command Prompt to open.

[Page 100]

Figure 97: Open Command Prompt

3. Type in aws configure (Figure 98).
a. When asked to provide the Access Key, enter the specific key for the IAM user.
b. When asked to provide the Secret Key, enter the specific key for the IAM user.
c. Set as region: enter the region of the S3 bucket, such as us-west-1 or eu-central-1. This information can be found in the main folder of your S3 bucket.
d. Set as output format: text

Figure 98: Input for AWS Configure command

4. Type in aws s3 ls to check whether the bucket is configured correctly. The output should show the name of your S3 bucket.

5. Within the command prompt, navigate to the folder to which to download the decoded data. For example, if the target folder is 'testdownload' on your C: drive, type cd testdownload (Figure 99). The command cd .. navigates back one step in the C: drive folder hierarchy.

Figure 99: Navigate to target folder

6. Within your AWS S3 account, navigate to the folder to download and select it with a checkmark (Figure 100). Click on Copy S3 URI to copy the folder location.

[Page 101]

Figure 100: Copy S3 folder URI

7. In the command prompt, type aws s3 sync then paste the URI copied from the S3 folder in Step 6. Following the pasted URI, type a space and then a period (.) (Figure 101).

Figure 101: Sync CLI to S3 folder

8. The download will begin once you press Enter. The duration of download time depends on the size of the files.

For technical support, please contact your Field Applications Scientist or support.spatial@bruker.com.

[Page 102]

Troubleshooting and Technical Support

For additional support, contact support.spatial@bruker.com.

Issue: "Something went wrong", "Study locked", or "SMIDA failed to authorize" message
Possible cause: Session timeout
Suggested actions: Log out of AtoMx SIP and log in again.

Issue: Study creation failed
Possible cause: Processing issue
Suggested actions: Evaluate log files for study creation. Click on Details and logs in the Study Details panel (Figure 102). Log files are downloaded as zipped files to your local Downloads folder. Once unzipped, they can be opened in a text editor such as Notepad.

Figure 102: Details and logs link in the Study Details panel.
Issue: A pipeline module fails
Possible cause: Processing issue
Suggested actions: Evaluate log files for an individual module by clicking the metrics icon on the module block in the Pipeline Structure panel, then click the download icon (Figure 103). Log files are downloaded as zipped files to your local Downloads folder. Once unzipped, they can be opened in a text editor such as Notepad.

Figure 103: The metrics icon opens module metrics and the option to download log files

Issue: Pipeline module fails with error "`[.data.frame`(obs, , tiledb_stored_names): undefined columns selected"
Possible cause: Pipeline run name begins with a number instead of a letter
Suggested actions: Rename the pipeline run starting with a letter instead of a number. See instruction on page 37.

Issue: Novae module fails
Possible cause: Insufficient threads or CPU cores; hub traffic
Suggested actions: The module may succeed on a retry. Try breaking up the study into smaller units for analysis. Contact support.spatial@bruker.com if the problem persists.

Table 9: CosMx SMI Data Analysis troubleshooting

[Page 103]

Issue: "Failed to rerun pipeline step [object Object]" message
Possible cause: Module error
Suggested actions: May occur if re-running a custom module that directly follows Initial Data. Please create and run a new pipeline instead of re-running the module.

Issue: Image export results in an error
Possible cause: Processing issue
Suggested actions: Wait 30 minutes and retry image export. If the issue persists, contact support.spatial@bruker.com.

Issue: JPG image export fails with error "unable to write to target"
Possible cause: Known issue related to JPG export
Suggested actions: Uncheck the box to include a scalebar when selecting the objects to export. Retry image export.

Issue: Export fails with error "Pod terminated preemptively" (or "ExpiredToken" in logs)
Possible cause: Export exceeded time limit of session
Suggested actions: Break up the export into smaller jobs by reducing the number of flow cells in the study or selecting fewer files to export from the Export Dataset dialog (Figure 58 on page 77).
Issue: Exporting Seurat object yields error "invalid class Seurat object" or "all cells in reductions must be in the same order as the Seurat object"
Possible cause: Known issue related to modules PCA, UMAP, Nearest Neighbor
Suggested actions: This issue will be addressed in an upcoming software release. In the meantime, please work around the issue by exporting the flat files and then using the Seurat package to create the Seurat object.

Issue: Difficulty with immune cell typing
Possible cause: Immune cells may pose particular challenges in cell typing
Suggested actions: Use HieraType, a hierarchical cell typing method optimized for detection and subclustering of immune cells (described in CosMx Scratch Space).

Issue: Image shows black circles or holes (Figure 104)
Possible cause: Fiducials were not prepared or applied properly
Suggested actions: Contact support.spatial@bruker.com to further troubleshoot.

Figure 104: Appearance of holes on tissue

[Page 104]

Bruker Spatial Biology, Inc.
3350 Monte Villa Parkway, Bothell, WA 98021
brukerspatialbiology.com
Tel: +1 888.358.6266
Fax: +1 206.378.6288
Technical Support: support.spatial@bruker.com
Customer Service: customerservice.spatial@bruker.com

Sales Contacts
North America: nasales.bsb@bruker.com
EMEA: emeasales.bsb@bruker.com
APAC: apacsales.bsb@bruker.com
All other regions: globalsales.bsb@bruker.com

Metadata

Document number
3
Chunk count
96
Source URI
/Users/axichang/Library/Mobile Documents/M28Q9HQMUK~com~mobisystems~OfficeSuite/Documents/冷泉港/產品資訊/nanoString/SMI/MAN-10162-11_CosMx_SMI_Data_Analysis_User_Manual_for_v2.2.pdf