[Page 1]
MANUAL
CosMx® SMI
Data Analysis
MAN-10162-11
For AtoMx® SIP v2.2
FOR RESEARCH USE ONLY. Not for use in diagnostic procedures.
Innovation with Integrity
[Page 2]
Contacts
Bruker Spatial Biology, Inc.
3350 Monte Villa Parkway
Bothell, Washington, USA 98021
www.brukerspatialbiology.com
Tel: 888-358-6266
Fax: 206-378-6288
Technical Support: support.spatial@bruker.com
Customer Service: customerservice.spatial@bruker.com
Sales Contacts
North America: nasales.bsb@bruker.com
EMEA: emeasales.bsb@bruker.com
APAC: apacsales.bsb@bruker.com
All other regions: globalsales.bsb@bruker.com
Luxendo GmbH
Im Breitspiel 2-4
69126 Heidelberg
Germany
UK Rep
NanoString Technologies, Europe Limited
Suite 2 First Floor, 10 Temple Back
Bristol, United Kingdom BS1 6FL
bnano.legal@bruker.com
[Page 3]
Rights, License, and Trademarks
Use
For Research Use Only. Not for use in diagnostic procedures.
Intellectual Property Rights
This CosMx® Spatial Molecular Imager (SMI) User Manual and its contents are the property of Bruker Corporation (“Bruker”),
and are intended solely for use by Bruker customers, for the purpose of operating the CosMx SMI System. The CosMx SMI
System (including both its software and hardware components), this User Manual, and any other documentation provided to you
by Bruker in connection therewith, are subject to patents, copyright, trade secret rights and other intellectual property rights
owned by, or licensed to, Bruker. No part of the software or hardware may be reproduced, transmitted, transcribed, stored in a
retrieval system, or translated into other languages without the prior written consent of Bruker. For a list of patents, see
www.nanostring.com/company/patents.
Limited License
Subject to the terms and conditions of the CosMx SMI System contained in the product quotation, Bruker grants you a limited,
non-exclusive, non-transferable, non-sublicensable, research use only license to use the proprietary CosMx SMI System only in
accordance with the manual and other written instructions provided by Bruker. Except as expressly set forth in the terms and
conditions, no right or license, whether express, implied or statutory, is granted by Bruker under any intellectual property right
owned by, or licensed to, Bruker by virtue of the supply of the proprietary CosMx SMI System. Without limiting the foregoing, no
right or license, whether express, implied or statutory, is granted by Bruker to use the CosMx SMI System with any third party
product not supplied or licensed to you by Bruker or recommended for use by Bruker in a manual or other written instruction
provided by Bruker.
Trademarks
Bruker Spatial Biology, the NanoString logo, CosMx and AtoMx are trademarks or registered trademarks of Bruker Corporation in
the United States and/or other countries. All other trademarks and/or service marks not owned by Bruker Corporation that appear
in this document are the property of their respective owners.
Open Source Software Licenses
Visit http://nanostring.com/cosmx-oss and http://nanostring.com/atomx-oss for a list of open source software licenses used in
CosMx Spatial Molecular Imaging.
Copyright
© 2025 Bruker Corporation. All rights reserved.
[Page 4]
Table of Contents
CosMx SMI Data Analysis User Manual 1
Rights, License, and Trademarks 3
Table of Contents 4
Changes in this Revision 7
Changes in v2.2 Affecting Customer-Built Analysis Pipelines (after AtoMx Data Export) 7
Conventions & Safety 8
CosMx SMI Workflow Overview 9
CosMx SMI User Manuals and Resources 10
CosMx SMI Data is Managed in the AtoMx Spatial Informatics Platform 11
Cell Segmentation 14
Cell Segmentation Workflow 15
Cell Segmentation Configurations 20
Examples of Adjustments to Parameters 22
Create and Open a Study 26
Delete a Study 27
Orientation to CosMx SMI Data Analysis Suite 28
Manage Annotations 36
Run a Pipeline 37
How to Run Data Analysis Pipelines That Require Parameter Inputs 40
Custom Modules 41
Recommendations for Analyzing CosMx Whole Transcriptome (WTX) Data 43
CosMx SMI Pipeline Modules 46
Initial Data 46
Quality Control - RNA 47
Quality Control - Protein 50
Gene Selection - RNA 51
Normalization - RNA 51
Normalization - Protein 53
Principal Component Analysis (PCA) - RNA or Protein 54
UMAP - RNA or Protein 54
Cell Typing (InSituType) - RNA 56
[Page 5]
Expression Model (CELESTA) - Protein 59
Cell Typing (CELESTA) - Protein 59
Cell Type QC - RNA 60
Neighbor Network: Expression Space - RNA or Protein 61
Leiden Clustering - RNA or Protein 61
Identify Marker Genes - RNA or Protein 62
Neighborhood Analysis - RNA or Protein 63
Ligand-Receptor (LR) Analysis - RNA 63
Spatial Network - RNA or Protein 64
Cell Type Co-Localization - RNA or Protein 64
Pathway Analysis - RNA 65
Spatial Expression Analysis - RNA 65
Differential Expression (DE) - RNA 66
Novae - RNA (Spatial Discovery studies only) 68
Spatial Discovery - RNA (Spatial Discovery studies only) 69
CosMx SMI Data Visualizations 70
Study Statistics Table 70
QC Metrics Table 70
XY Plot 71
Heatmap 71
Box Plot and Violin Plot 72
Histogram 72
PCA Plot 73
UMAP Plot 73
Volcano Plot 74
Flightpath Plot 74
Save a Visualization 75
Export Images 76
Export Data 77
Working with Exported Data 81
Appendix I: Literature References 86
Appendix II: Create a Signature Matrix and Tuning Parameter File for Cell Typing (CELESTA) 87
Appendix III: Set Up to Export Data to an AWS S3 Bucket 90
Appendix IV: Download CosMx SMI Files From S3 Bucket After Export 99
[Page 6]
Troubleshooting and Technical Support 102
Final page 104
[Page 7]
Changes in this Revision
This user manual has been updated to reflect new features of CosMx SMI Data Analysis version 2.2:
• Transfer flow cell ownership to other AtoMx SIP account holders, described on page 12
• Open a flow cell image in a new tab of the browser window, described on page 12
• Updated cell segmentation use cases for muscle (cross-sections, longitudinal sections, and heart) and examples
for resegmentation, beginning on page 20
• Added information about an experimental new Spatial Discovery study configuration (page 26), visualization-first
features and download package for LLM analysis (pages 28-29), and data analysis pipeline (page 37)
• Updated the Recommendations for Analyzing Whole Transcriptome (WTX) Data with more detail on calculating
SNR and incorporating the Gene Selection module, on page 43
• Added information about new data analysis modules in the Spatial Discovery pipeline, Novae and Spatial
Discovery, on page 68
• Corrected description of FOV Positions flat file (coordinates refer to top left of FOV rather than center) on page 78
• Added to Troubleshooting section on page 102
• Minor edits for clarification
Changes in v2.2 Affecting Customer-Built Analysis Pipelines (after AtoMx Data Export)
Please be aware of these changes in v2.2 that may require edits to local, customer-built analysis pipelines:
• Cell metadata columns will always contain "." instead of "-" regardless of the modules run prior to export.
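For a customer-built pipeline that still refers to older column names, one possible adjustment is to rewrite the expected names before matching them against a v2.2 export. The following Python sketch is illustrative only: the function names and the sample column names are hypothetical and do not come from AtoMx SIP; only the "-" to "." rule is taken from the note above.

```python
def to_v22_column_name(name: str) -> str:
    """Return the column name with every '-' replaced by '.'."""
    return name.replace("-", ".")


def normalize_columns(names):
    """Apply the '-' to '.' rule to a list of metadata column names."""
    return [to_v22_column_name(n) for n in names]


# Hypothetical column names, for illustration only.
legacy_names = ["cell_ID", "nb-clus", "Mean.PanCK"]
print(normalize_columns(legacy_names))  # ['cell_ID', 'nb.clus', 'Mean.PanCK']
```

Names that already use "." are left unchanged, so the same helper can be applied to old and new exports alike.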
[Page 8]
Conventions & Safety
The following conventions are used throughout this manual and are described for your reference.
Bold text is typically used to highlight a specific button, keystroke, or menu option. It may also be used to highlight
important text or terms.
Blue underlined text is typically used to highlight links and/or references to other sections of the manual. It may also
be used to highlight references to other manuals and/or instructional material.
The gray box indicates general information that may be useful for improving assay performance. The notes may
clarify other instructions or provide guidance to improve the efficiency of the assay workflow.
WARNING: This symbol indicates the potential for bodily injury or damage to the instrument if the
instructions are not followed correctly. Always carefully read and follow the instructions accompanied by
this symbol to avoid potential hazards.
IMPORTANT: This symbol indicates important information that is critical to ensure a successful assay.
Following these instructions may help improve the quality of your data.
Safety
WARNING: Read the Safety Data Sheets (SDS) and follow the handling instructions. Wear appropriate
protective eyewear, clothing, and gloves. SDS are available from
https://nanostring.com/resources/safety-data-sheets.
[Page 9]
CosMx SMI Workflow Overview
Figure 1: CosMx SMI workflow overview
Day 1: Slide Preparation. Prepare slides manually or using the BOND RX/RXm fully automated IHC/ISH stainer
from Leica Biosystems (BOND RX/RXm).
Day 2: Process Slides on CosMx SMI. Complete the assay and assemble the flow cells. Load assembled flow cells
into the CosMx SMI instrument and enter flow cell/study information. Tissue is scanned to capture RNA or Protein
readout and morphology imaging within user-designated fields of view (FOVs).
After run completion: Create a Data Analysis study in the AtoMx® Spatial Informatics Platform (SIP), perform
quality-control checks and data analysis, and generate analysis plots.
[Page 10]
CosMx SMI User Manuals and Resources
The CosMx SMI workflow is divided into the following user manuals:
Step | RNA | Protein | Multiomics
Prepare Slides | CosMx SMI Manual Slide Preparation for RNA Assays (MAN-10184); CosMx SMI Semi-Automated Slide Preparation for RNA Assays (MAN-10186) | CosMx SMI Manual Slide Preparation for Protein Assays (MAN-10185); CosMx SMI Semi-Automated Slide Preparation for Protein Assays (MAN-10187) | CosMx SMI Manual Slide Preparation for Multiomic Assays (MAN-10201)
Process on CosMx SMI | CosMx SMI Instrument User Manual (MAN-10161), all assay types
Data Analysis | CosMx SMI Data Analysis User Manual (MAN-10162), all assay types
User manuals and other documents can be found online in the NanoU Document Library. Instrument and workflow
training courses are also available in NanoU.
For information about the AtoMx Spatial Informatics Platform, please refer to the AtoMx SIP
Platform Administration Manual (MAN-10170).
Additional data analysis support and resources can be found in the CosMx Analysis Scratch Space.
[Page 11]
CosMx SMI Data is Managed in the AtoMx Spatial Informatics Platform
After primary processing on-instrument, data from the CosMx Spatial Molecular Imager (SMI) is automatically sent
to the cloud-based AtoMx Spatial Informatics Platform (SIP) for further processing, storage, management, and
collaboration. CosMx SMI data analysis is performed within AtoMx SIP in the CosMx SMI Data Analysis Suite
(Figure 2).
Figure 2: The AtoMx SIP cloud platform houses the CosMx SMI Data Analysis Suite
This CosMx SMI Data Analysis User Manual outlines the steps to analyze data in the Data Analysis Suite using the
pipeline orchestrator and analysis modules. Learn more about spatial data analysis through NanoU and CosMx
Analysis Scratch Space. Data analysis services are also available through Bruker (see Spatial Data Analysis Service).
Connecting CosMx SMI to AtoMx SIP
For CosMx SMI instrument owners, the instrument is registered to AtoMx SIP during installation or user training.
Once registered, data transfer from the instrument to AtoMx SIP occurs automatically. Refer to the CosMx SMI Site
Preparation Guide (MAN-10171) and CosMx SMI Instrument User Manual (MAN-10161) for additional information
and to address situations that require manual transfer or intervention.
Accessing AtoMx SIP
Log in to AtoMx SIP at your organization's custom URL. Refer to the AtoMx SIP Platform Administration Manual
(MAN-10170) for more information about account set-up.
[Page 12]
Orientation to AtoMx SIP
The AtoMx SIP Home page and key features are shown below (Figure 3). For more details about AtoMx SIP (outside
of the Data Analysis Suite), please refer to the AtoMx SIP Platform Administration Manual (MAN-10170).
Figure 3: Home page with key components labeled
From the left menu, navigate within sections of the platform. Please note that only the Home page is visible to
External Users (AtoMx SIP users outside an Organizational Account; see the AtoMx SIP Platform Administration
Manual (MAN-10170) for more information).
The gallery is accessible through the Home page and displays the data collections, studies, and CosMx SMI flow
cells on individual cards. The most recently accessed objects are displayed. Click on View All to view all objects.
Collections are groups of data objects (flow cells and studies). Individual objects can be part of more than one
collection. Deleting a collection does not delete the individual objects in the collection.
Locate specific objects of interest by using the Sort By tool (top right) or the Global Content Filter (check boxes at left),
typing terms into the search bar, or clicking the filter icon to open a filtering pane.
Click the caret on a card to expand details about the flow cell status (Table 1). Click the 3 dots on a card for
options including View, Open in new tab, Add to collection, View comments, View details, and More
options (Delete).
Select a study (using its check box) to activate the Share button. Select a flow cell to activate the Transfer button.
Only flow cells with status Ready for Study Creation can be transferred. More details about sharing studies and
transferring flow cells are in the AtoMx SIP Platform Administration Manual (MAN-10170).
[Page 13]
Flow cell status | Definition
Created | A flow cell record has been created in AtoMx SIP and is awaiting the data upload process from the CosMx SMI instrument.
In progress / Data upload | Upload and ingestion of data in progress. It can take up to 48 hr after run completion for data to finish uploading to AtoMx.
In progress / Image processing | Image stitching in progress for the layers of the display image in the viewer. Includes the preview scan, morphology image, and protein images (if applicable).
In progress / Target decoding | Target decoding in progress.
Ready for study creation | Prior steps are complete; flow cell is ready for analysis.
In progress / Resegmentation | Segmentation in progress; the flow cell cannot be used for analysis until segmentation is complete.
Ready for study creation with error | Flow cell encountered an error during segmentation. Flow cell is ready for analysis with the previous segmentation results. Re-trying segmentation may resolve the issue.
Error | An error in data upload, image processing, or target decoding has occurred. Click on the 3 dots, then View details for more information about the error.
Table 1: Flow cell status definitions. Click the caret to see details of "in progress".
CosMx SMI Data Ingestion Best Practices
• For image processing errors that occur during a CosMx SMI run, please allow the run to complete and perform
the post-run cleaning before contacting Support at support.spatial@bruker.com. This ensures the instrument has
completed the workflow before any intervention. It will not affect data transfer or result in lost data.
• During the post-run cleaning on CosMx SMI, a message indicates "Data upload in progress". These are the
cleaning logs being uploaded to AtoMx SIP.
• Please allow up to 48 hr after run completion for flow cell data to finish uploading to AtoMx SIP. If flow cell status
checkmarks (Figure 3) are incomplete or an error persists after 48 hours, contact support.spatial@bruker.com.
• To prevent data loss, resolve any data ingestion issues before starting a new CosMx SMI run.
[Page 14]
Cell Segmentation
Accurate cell segmentation is the foundation of meaningful spatial biology data. An overview of the AtoMx SIP cell
segmentation pipeline is shown in Figure 4. Each z-stack has 5 channel images with different morphology marker
signals. The images are projected and processed into one single image for each FOV. Pre-processing improves the
image quality. The morphology marker signals are processed by machine-learning (ML) augmented cell
segmentation to define the soma (cell body) and nuclear labels. The machine-learning output is processed into final
cell labels to reflect the cells and their compartments for application in downstream analyses.
Figure 4: Cell segmentation pipeline in AtoMx SIP.
The cell segmentation options in AtoMx SIP enable researchers to evaluate the segmentation initially performed on
their flow cell image and (if desired) change the configuration settings to optimize segmentation. Researchers can
also apply different segmentation configurations to different FOVs across a flow cell to accommodate tissue arrays
or other scenarios requiring FOV customization.
Segmentation on AtoMx SIP (if desired) is performed after the decoded data is transferred from the
CosMx SMI instrument to AtoMx SIP and before data analysis studies are created. Segmentation can be
performed multiple times in an iterative process to refine and optimize the cell boundaries for the particular
biological sample or study. Segmentation profiles are named and carry version numbers. Once a data analysis study
is created using a particular set of segmentation parameters, images can be segmented again by exiting the study
and following the steps below. The new segmentation data will not overwrite the existing study, but new studies
created will reflect the latest segmentation data.
[Page 15]
Cell Segmentation Workflow
1. From the AtoMx SIP gallery, click on a flow cell to open it in the Image Viewer. Only flow cells with the status
Ready for study creation are eligible for segmentation.
2. From the FOVs tab, click the Flow Cell Profile Name to view current segmentation parameters and/or click
Segment Cells to open the Cell Segmentation Options window (Figure 5).
Figure 5: Cell Segmentation Options (center) and Current Segmentation View Details (right)
3. Fill in a Flow Cell Profile Name (naming the set of parameters which will be defined in the following steps, e.g.,
Tonsil Segmentation 4). Do not use a comma in the flow cell profile name.
4. From the FOV(s) dropdown, select the FOV(s) for this segmentation. (Select one or a few FOVs to apply the
segmentation initially. Subsequently, the profile can be applied to more FOVs on the flow cell.)
5. From the Configuration dropdown, select the Configuration to serve as the basis for the segmentation. See
examples in Cell Segmentation Configurations on page 20.
6. (Optional) Adjust parameters in the Basic tab (Table 2). See Examples of Adjustments to Parameters on page 22.
[Page 16]
Basic Parameter | Description
Nucleus Diameter (µm) | How the cell mask is adjusted to accommodate nuclei of different sizes. Increase the value from the default to include more area; decrease to reduce the nuclear area.
Cytoplasm Diameter (µm) | Aids the segmentation algorithm in resizing the image before segmentation is run or re-run. Enter a value that represents the average cell diameter in this sample or region of sample, or iterate based on previous segmentation results (if cells appear over-split, increase cell diameter; if cells appear over-merged, decrease cell diameter).
Final Cell Label Min Size | Threshold for filtering cell structures based on minimum required size.
Final Dilation/Erosion (µm) | Adjusts the size of the cell mask, e.g. to include or exclude more transcripts that are located at the edges (that is, on top of the cell membrane). Increase the value from the default to include more area; decrease to reduce the included area. Dilation will automatically stop at boundaries between cells.
Table 2: Basic parameters in cell segmentation.
7. In the Advanced tab, targets and channels can be customized (Figure 6). For example, a channel with misleading
staining (blurry, dim, or high autofluorescence) can be turned off, or an a la carte
marker's channel can be turned on.
Configurations A, C, D, E, F, G must have one channel for nuclei selected.
Configuration B must have the B and/or U channel set to nuclei; other channels
of Configuration B cannot be customized at this time. Configuration F targets
and channels cannot be customized at this time.
Other parameters in the Advanced tab can be adjusted (see Table 3); however,
most samples do not require changes to Advanced parameter default
values. The values are editable for the rare situations where adjustments to the
basic parameters cannot accurately capture the sample's cell boundaries. If
changes are needed, it is recommended to edit only Model, Probability
Threshold, and/or Flow Threshold under Cell Options or Nuclei Options. Refer
to the Examples of Adjustments to Parameters on page 22.
Figure 6: Advanced parameters enable custom selection of nuclei, morphology, or other targets by channel.
[Page 17]
Advanced Parameter | Description
Include Mask Processing | Check box to include the 'post-ML processing' step as depicted in Figure 4. The only situation in which you may want to exclude mask processing is if the staining is poor or image is blurry, such that the software lacks good data for processing.
Foreground Threshold | A measure of the probability that a certain area belongs to a certain cell. Default 0, range -6 to 6. A more positive number increases the stringency and includes less area. A more negative number includes more area.
Nearest Neighbor Split | True/False. Whether to assign a weak protrusion to the nearest neighbor cell.
Min Diam Ratio | Default 0.25, meaning that final cell segments will not be smaller than 0.25 * (Diameter indicated in the selected segmentation configuration). See Configurations on page 20.
Max Diam Ratio | Default 4, meaning that final cell segments will not be larger than 4 * (Diameter indicated in the selected segmentation configuration). See Configurations on page 20.
Nucleus Label Model | Segmentation model to employ. Default is bsbNuc. For details on other models from Cellpose, see https://cellpose.readthedocs.io/en/latest/models.html
Nucleus Probability Threshold | Threshold to filter out pixels of small nucleus probability that correspond to non-nuclei. When staining is poor, lowering this value to the minimum of -6 can pick up more nuclei, but results may be more noisy.
Nucleus Flow Threshold | Threshold used to filter out magnitude of flows that correspond to small debris/non-nuclei. When staining is poor, lowering this value can pick up more nuclei signal, but results may be more noisy.
Cytoplasm Label Model | Segmentation model to employ. Default is bsbCyto or bsbNeuro3Ch. For details on other models from Cellpose, see https://cellpose.readthedocs.io/en/latest/models.html
Cytoplasm Probability Threshold | Threshold to filter out pixels of small cell probability that correspond to non-cells. When staining is poor, lowering this value to the minimum of -6 can pick up more cells, but results may be more noisy.
Cytoplasm Flow Threshold | Threshold used to filter out magnitude of flows that correspond to small debris/non-cells. When staining is poor, lowering this value can pick up more cells, but results may be more noisy.
Final Cell Label Max Size | When segmenting is complete, no cell segment (area) will be larger than this value.
Prefer Cytoplasm Over Nucleus | True/False. TRUE will favor the cytoplasm-based segmentation over the nuclei-based segmentation calls.
Mononuclear Cell | True/False. If one cytoplasm covers multiple nuclei, TRUE will split the cytoplasm to yield mononuclear cells whereas FALSE will leave the cytoplasm with multiple nuclei.
Table 3: Advanced parameters in cell segmentation
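As a worked illustration of the Min Diam Ratio and Max Diam Ratio defaults, the short Python sketch below computes the size bounds they imply. It assumes that the "Diameter" in question is the Cell Diameter of the selected configuration (8.28 µm for Configuration A, per the Configurations section); the function names are hypothetical and are not part of AtoMx SIP. A second helper checks a value against the -6 to 6 range stated for the Foreground Threshold.

```python
def segment_diameter_bounds(config_diameter_um, min_ratio=0.25, max_ratio=4.0):
    """Return (smallest, largest) allowed segment diameter in µm.

    Assumption: the ratios multiply the configuration's cell diameter.
    """
    if config_diameter_um <= 0:
        raise ValueError("diameter must be positive")
    return min_ratio * config_diameter_um, max_ratio * config_diameter_um


def foreground_threshold_in_range(value, low=-6.0, high=6.0):
    """True if the value lies inside the documented -6 to 6 range."""
    return low <= value <= high


# Configuration A cell diameter, as listed in the Configurations section.
smallest, largest = segment_diameter_bounds(8.28)
print(f"{smallest:.2f} µm to {largest:.2f} µm")  # 2.07 µm to 33.12 µm
```

With the defaults, a Configuration A profile would therefore keep final segments between roughly 2.07 µm and 33.12 µm under this reading of the table.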
8. When the parameters are set, click Apply to apply the new segmentation profile to the selected FOVs.
9. A dialog indicates that the flow cell is being segmented. Other users viewing the flow cell tile in the gallery see
the status change from Ready for study creation to Resegmenting, indicating that the flow cell is being
processed and is not available for data analysis. Segmentation takes 2-3 minutes per FOV. Any errors will be
reported in a dialog on the screen.
[Page 18]
10. When complete, the new cell segmentation is displayed in the Image Viewer, with the profile name and version
listed in the FOVs panel (refer back to Figure 5) and the Image Viewer legend (Figure 7).
Figure 7: Segmentation details in Image Viewer legend
11. Next, you can:
• Further modify the current segmentation profile by adjusting parameters and clicking Apply. The new
parameters will be defined as version 2, then version 3, etc. To view version details (such as creator, date
created, configuration, FOVs, and parameters), click the Version link in the FOVs panel (see link All FOVs (16) in
Figure 5).
• Apply the current segmentation profile to additional FOVs in the flow cell by selecting additional FOVs from
the dropdown and clicking Apply.
• Segment other FOVs with different configurations, if desired, by selecting the new FOV from the dropdown
and setting the configuration and parameters as desired. Figure 8 gives an example of adjacent FOVs with
different segmentation configurations.
Figure 8: Different FOVs can use different segmentation configurations.
12. Once segmentation is complete, the flow cell is ready for data analysis. The segmentation profile(s) in effect for
this flow cell are displayed on the Create Study window (Figure 9) and in the flow cells panel of the CosMx
SMI Data Analysis Suite (Figure 10).
[Page 19]
Figure 9: Segmentation profiles for selected flow cells are shown at Create Study step
Figure 10: Segmentation profiles applicable to a study's FOV
are shown in the Flow cells panel of the SMI Data Analysis Suite
Once the flow cell has had a study created from it, it can still be segmented (following the steps above, starting
from the gallery). Existing studies will not be overwritten, but new studies will use the new segmentation
profile. When a flow cell is opened in the Image Viewer from within a data analysis study, the version of the
segmentation profile that was used for this study is displayed in the FOVs panel and in the legend.
Cell segmentation profiles applied to a flow cell are saved in the flow cell metadata, which can be exported
through the built-in export function or the custom export module. In the raw decoded data output, folder
CellStatsDir contains a folder for each version of segmentation applied, and within those,
SegmentationManifest_Parameters_[SetID]_[version].json contains the cell segmentation set information for
every resegmented FOV.
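After export, these manifest files can be collected programmatically. The Python sketch below is a minimal example under stated assumptions: it relies only on the folder name CellStatsDir and the file name pattern given above, the function name is hypothetical, and because the contents of the JSON files are not described in this section, the code loads them without interpreting any fields.

```python
import json
from pathlib import Path


def load_segmentation_manifests(cell_stats_dir):
    """Return {file path: parsed JSON} for every manifest found.

    Searches all sub-folders (one per segmentation version) for files
    named SegmentationManifest_Parameters_*.json.
    """
    root = Path(cell_stats_dir)
    manifests = {}
    if not root.is_dir():
        return manifests  # nothing to load if the folder is absent
    for path in sorted(root.rglob("SegmentationManifest_Parameters_*.json")):
        with path.open(encoding="utf-8") as handle:
            manifests[str(path)] = json.load(handle)
    return manifests


# "CellStatsDir" here stands for the local path of the exported folder.
for file_path in load_segmentation_manifests("CellStatsDir"):
    print(file_path)
```

Listing the file paths first is a simple way to confirm which segmentation versions are present before reading any parameters from them.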
[Page 20]
Cell Segmentation Configurations
Configuration A
Default for non-neural human tissues
Cell Dilation (µm): 2.16
Cell Diameter (µm): 8.28
Nucleus Diameter (µm): 7.2
Image: Human Breast Tissue
Configuration B
For mouse and human neural tissue with RNA assays
Cell Dilation (µm): 0.72
Cell Diameter (µm): 14.4
Nucleus Diameter (µm): 10.8
Image: Human Neural Tissue (FFPE) (RNA assay)
Configuration C
For nuclei/cells slightly larger than regular human cells
(e.g. cell pellet arrays and some cultured cells)
Cell Dilation (µm): 4.32
Cell Diameter (µm): 14.4
Nucleus Diameter (µm): 10.8
Image: Cell Pellet Array
Configuration D
For tissue containing very large cell types
(e.g. megakaryocytes or osteosarcoma)
Cell Dilation (µm): 4.32
Cell Diameter (µm): 36
Nucleus Diameter (µm): 18
Image: Human Osteosarcoma
[Page 21]
Configuration E
For heart muscle or combinations of large and small cells,
e.g. hepatocytes with smaller immune cells in liver
Cell Dilation (µm): 4.32
Cell Diameter (µm): 29.28
Nucleus Diameter (µm): 5.2
Image: Human Heart Muscle
Configuration F
For mouse neural tissue with protein assays
Cell Dilation (µm): 0
Cell Diameter (µm): 16.56
Nucleus Diameter (µm): 6.2
Image: Mouse Neural Tissue (Protein assay)
Configuration G
Developed with laminin marker. For muscle and multi-
nucleated cells. Optimized for cross-sections. See
Example 6 on page 24 for adjustments for longitudinal
sections.
Cell Dilation (µm): 2.16
Cell Diameter (µm): 5.2
Nucleus Diameter (µm): 30
Image: Muscle with Immune Cells
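For local record keeping or scripted comparisons, the default values above can be held in a small lookup table. The Python sketch below is one way to do this; the class and variable names are hypothetical, and the numbers are transcribed from Configurations A through G exactly as printed in this manual. The last lines reproduce the adjustment described in Example 3 (halving the Configuration E dilation) as a usage illustration.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class SegmentationConfig:
    use_case: str
    cell_dilation_um: float
    cell_diameter_um: float
    nucleus_diameter_um: float


# Defaults as printed in the configuration listings (values in µm).
CONFIGURATIONS = {
    "A": SegmentationConfig("Non-neural human tissues (default)", 2.16, 8.28, 7.2),
    "B": SegmentationConfig("Mouse and human neural tissue, RNA assays", 0.72, 14.4, 10.8),
    "C": SegmentationConfig("Slightly larger nuclei/cells", 4.32, 14.4, 10.8),
    "D": SegmentationConfig("Very large cell types", 4.32, 36, 18),
    "E": SegmentationConfig("Heart muscle or mixed cell sizes", 4.32, 29.28, 5.2),
    "F": SegmentationConfig("Mouse neural tissue, protein assays", 0, 16.56, 6.2),
    "G": SegmentationConfig("Muscle and multi-nucleated cells", 2.16, 5.2, 30),
}

# Usage: start from Configuration E and halve the dilation, as in Example 3.
base = CONFIGURATIONS["E"]
tighter = replace(base, cell_dilation_um=base.cell_dilation_um / 2)
print(tighter.cell_dilation_um)  # 2.16
```

Using frozen dataclasses keeps the printed defaults read-only, so every adjusted profile is an explicit copy rather than a silent edit of the reference values.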
[Page 22]
Examples of Adjustments to Parameters
Example 1: Since DAPI signal is low and missing in some cells, segmentation relies heavily on the membrane
signal. Some cells are larger, so the segmentation configuration is changed from Configuration A to D. The cell
model is changed from the default CP to cyto2, which better fits the elongated shapes in the membrane areas
and low DAPI signal.
Original: Configuration A | Adjusted: Configuration D + cyto2 model
Example 2: Normal kidney tissue; some cells have little boundary information. By reducing the Cell Probability and
Flow Thresholds, more cells are segmented that were missed in the default configuration (marked in red on left).
Original: Configuration A | Adjusted: Cell Probability Threshold = -6; Flow Threshold = 0
[Page 23]
Example 3: Liver tumor tissue; original Configuration E contains dilation of 4.32 µm. Reduce by half to 2.16 µm to
tighten the area around each cell.
Original: Configuration E | Adjusted: Reduce dilation
Example 4: Out-of-focus membrane signal that ought to be included in the segmentation; increase the dilation
parameter from the default.
Original: Configuration A | Adjusted: Increase dilation
[Page 24]
Example 5: When staining is poor and/or cells are not in focus, the Cyto2 model may perform better than the
default CP model. In this example, the cell probability and flow threshold are also reduced to improve
segmentation.
Original: CP model | Adjusted: Cyto2 model
Example 6. Segmentation of different types of muscle cell. For muscle cross-sections, Configuration G is
recommended, and segmentation may be improved upon by designating PanCK and CD298/Membrane markers
as membrane markers during resegmentation in AtoMx SIP. In some cases, autofluorescence in those channels
may add useful segmentation information.
For samples with longitudinal muscle sections, or a combination of longitudinal and cross-sections, it is
recommended to select Configuration G on-instrument, then resegment in AtoMx SIP with the Configuration G
base parameter, adjusting the nucleus diameter to be twice the default value, and selecting the Cyto2 nuclei
model.
Original: Configuration G
Adjusted: Nuclei diameter = 60 µm, Cyto2 model
[Page 25]
Example 7. For heart muscle, select Configuration G or E on-instrument, then resegment in AtoMx SIP, excluding
PanCK and CD45 in the segmentation advanced parameters (see page 16). (Note that both Configuration G and E
perform well if laminin is used as a marker with heart muscle. If laminin is not used, Configuration E tends to
perform better.)
Original: Configuration E
Adjusted: Exclude PanCK and CD45 channels
[Page 26]
Create and Open a Study
Studies can be created from data that has completed transfer (upload) from the instrument. Only flow cells run
with the same panel can be combined in one study.
1. From the AtoMx SIP gallery, click the check box on the card(s) of the flow cells of interest. If the object is marked
In progress, it is not yet available for study creation. The Create Study button in the top-right becomes active
(Figure 11).
Figure 11: Select a flow cell to activate the Create Study button
2. Click Create Study.
3. In the Create Study window, enter the Study Name, Description, and Tags to annotate the study. For
RNA studies, starting in AtoMx SIP v2.2, select whether to run the classic configuration or Spatial Discovery
configuration. (Spatial Discovery is an experimental visual-first interface that automatically runs a Spatial
Discovery analysis pipeline, described on page 37, and presents interactive visualizations upon opening the
study.) Click Create.
During study creation, flow cell data are aggregated and organized into a data object. This process takes longer for
studies with more flow cells and FOVs. When a Spatial Discovery study is created, a default pipeline is also run,
so this process takes longer than classic study creation.
To open this or any existing study, navigate to your
studies in the Home gallery and click on the study to open the
CosMx SMI Data Analysis Suite.
A study creation log records the software's steps in
creating the study. If needed for troubleshooting, download
the study creation log from the Pipeline Structure panel (prior
to any pipeline run) or the Study Details Panel (Figure 12) of
the Data Analysis Suite.
Figure 12: Download study creation logs
[Page 27]
Delete a Study
To delete a study:
1. Click the 3-dots menu on the study's card in the gallery and select More
Options, then Delete (Figure 13).
2. A confirmation window opens to confirm deleting the study from the gallery. If
confirmed, the study is moved to Deleted Objects, accessible from the menu at left.
3. To review deleted objects, click Deleted Objects (Figure 14). Objects may be
restored to the gallery or permanently deleted, depending on the permission level
of the user, by checking the object's check box and selecting a button in the top
right.
Deleting a collection does not delete the individual items in the collection.
Figure 13: Delete a study
Figure 14: Deleted Objects folder and Delete/Restore buttons
[Page 28]
Orientation to CosMx SMI Data Analysis Suite
CosMx SMI studies created with Spatial Discovery configuration open in the Spatial Discovery view (Figure 15).
Studies created with classic configuration open in the classic view (Figure 16). See study configurations on
page 26.
Figure 15: Spatial Discovery view in the CosMx SMI Data Analysis Suite.
Figure 16: Classic study view in the CosMx SMI Data Analysis Suite.
Different panels, described in the following pages, can be displayed or hidden to provide a customized workspace.
Choose from the available preset views, or customize by clicking Add panels. You may need to scroll to the right to
view all displayed panels. Minimize or maximize a panel using controls in its top right. Click and drag panels to
arrange the view to your liking, then save your customized view by clicking Save view.
[Page 29]
Spatial Discovery View: Data Overlay
Upon opening a study in Spatial Discovery view, cells are colored by niche based on the
results of the foundational pipeline that has run in the background on study creation
(specifically, the results of the Novae or Neighborhood Analysis module; see page 68).
The default selections and coloring options can be changed using the controls in the Data
Overlay panel (Figure 17). Instead of coloring by niche, cells can be colored by cell type,
expression of a target, morphology markers, pathways, or total count. Under the Feature
dropdown, select the features to color (such as cell type, target, marker, or niche,
depending on the coloring scheme). The Levels dropdown sets the number of Novae
niches to display.
Toggle on transcripts display by clicking Transcripts at the top of the Image Viewer (refer
back to Figure 15). Once transcripts are displayed, their view settings can be edited.
Use the Cells to display filter to select a subset of cells to display in the image. For
example, to show only cell types identified as Cluster 2 in the Leiden Clustering module
(Figure 17), toggle to Filtered; select Variable: Cell types; select Source: Leiden
Clustering; and select Feature: 2. Only those cells are displayed.
Figure 17: Data Overlay panel in Spatial Discovery view
Click Spatial Discovery at the bottom of the Data Overlay panel to launch Spatial
Discovery Study Insights (Figure 18). This downloadable package includes a curated set of
outputs from the current study, designed for use with your preferred Large Language
Model (LLM). A ReadMe file is included, with details about the file contents, and users are
also directed to the Guide to Using the Spatial Discovery Output with LLMs hosted in
ScratchSpace.
Figure 18: Launch Spatial Discovery Study Insights
Spatial Discovery View: Study Data
The Study Data panel displays the Study Details and the Flow Cell and Pipeline Run panels. These panels are also a
part of the classic study view and are described on the following pages.
[Page 30]
Study Details Panel
Includes information about the study, including the number of Fields of View
(FOV) analyzed, the number of cells and genes detected, and the version of
AtoMx in which the study was created (Figure 19). Click Details and logs to
access more study information. Click Export to launch the Export Dataset
dialog (see Export Data on page 77) or Show Export Details to review a
completed export job.
Flow Cells Panel
Indicates which FOV is under analysis and the applicable cell segmentation
profile (Figure 20). Click the pencil icon to open the Manage Annotations
window (see Manage Annotations on page 36). Click the Search (magnifying
glass) icon to search for a particular FOV by name. For a particular flow cell,
click the arrow (caret) to view a list of the FOVs in the flow cell. Uncheck the
box next to an FOV to de-select it from the Image Viewer panel.
Pipeline Run Panel
Displays executed pipelines (Figure 21). Click the arrow (caret) to display
the modules comprising each pipeline. Click the pipeline icon to run a new
pipeline (see Run a Pipeline on page 37). Click the custom modules icon
to open Custom Modules (see Custom Modules on page 41).
Click the trash icon to delete the pipeline from the Pipeline Run panel.
Pipeline run names must begin with a letter, not a number.
Figure 19: Study Details panel
Figure 20: Flow Cells panel
Figure 21: Pipeline Run panel
[Page 31]
Pipeline Structure Panel
Presents a diagram of the data analysis pipeline selected in the
Pipeline Run panel. Individual pipeline modules are depicted as blocks
(Figure 22) with icons representing the following tools:
• Information about the module
• Show resource metrics (execution time, peak memory, average memory, peak CPU, average CPU) with option to
download metrics and logs
• Mark for interactive run (for custom modules with live interaction; opens live terminal while module runs) or
to disable interactive mode
• Rerun step
• Show settings (description, input parameters, output visualizations)
• Download files associated with module output (if available)
Numbers on the module blocks indicate the order of operations in the pipeline run. When a workflow is running, its
status is depicted at the bottom of the Pipeline Structure panel. Completed modules are displayed with a check
mark, while modules in progress are striped and modules pending an upstream step are marked Pending. Modules
which failed are red.
For descriptions of each module, please see CosMx SMI Pipeline Modules on page 46.
Pipeline Data Panel
Displays data visualizations for the module selected in the Pipeline
Structure panel. Available visualization types vary depending on
the module run. For example, the Normalization module enables
Study Statistics table, XY plot, Heatmap, Box plot, Violin plot, and
Histogram (Figure 23) while the PCA module enables Study
Statistics table and PCA plot. For descriptions of all visualization
types and their customizable settings, please see CosMx SMI
Data Visualizations on page 70.
Figure 22: Pipeline Structure panel
Figure 23: Pipeline Data panel - honeycomb plot example
[Page 32]
Image Viewer Panel
Displays the tissue image and cell segmentation (Figure 24), with the option to overlay data. FOVs are
bordered by white boxes.
Toggle the minimap and channel legend display. Modify the display colors by hovering on the channel
name, then clicking the pen icon that appears. Pan across the image by clicking and dragging in the
minimap or main (larger) image. Use the zoom control buttons (+ –) or mouse scroll wheel to zoom. Expand
to full screen by clicking the icon in the top-right of the Image Viewer panel.
Figure 24: Image Viewer panel displays tissue and spatial data.
Select the flow cell to display from the dropdown menu. View the entire image or select specific FOVs.
Select data to overlay (cells and/or transcripts). Up to 25 genes' transcripts can be displayed at once. Select
from additional display options: for Cells overlay, color by Cell Types, Expression, Morphology Markers,
Pathways, or Total Count; select which pipeline step's data to display, and which cell type(s) or gene(s) to
display. For Transcripts data, select which target(s) to display (up to 25). See examples of these data overlay
options in Figure 24 through Figure 29. Click the color palette to adjust the display colors or assign custom
colors.
Click the menu icon in the top left to open the Image Viewer menu, with tabs for flow cell information, FOVs (with
the option to hide outlines or labels), image layers (with the option to hide layers), render settings (with the option
to hide channels, adjust intensity color scaling, and modify on-screen color mapping), and export images (see also
Export Data on page 77). With software v2.1, the Image Layers tab has been updated to include opacity and
visibility controls for the cell segmentation layer.
[Page 33]
Recommended Data Overlays and Interactivity using the Image Viewer Panel
Figure 25: Overlay cell transcripts
Figure 26: Overlay cell types (Leiden clustering module output)
Figure 27: Overlay cell types (Cell Typing module output)
Figure 28: Overlay cell niches (Neighborhood Analysis module output)
[Page 34]
Figure 29: (Top) Overlay cell types in Image Viewer and display UMAP in adjacent Pipeline Data panel.
Toggle 'Enable Selection' on in Pipeline Data panel and use lasso tool to select cells of interest on UMAP, to
visualize those cells in the tissue (bottom).
Figure 30 shows the orientation of the tissue in the Image Viewer panel.
Figure 30: Tissue orientation from the glass slide to the Image Viewer
[Page 35]
Annotation Editor Panel
Figure 31: Annotation Editor panel
View selected flow cells and FOVs (Figure 31). Click the + icon
to add flow cell or FOV attributes, which appear as new columns
in the table. Click on a column's 3-dots menu to rename, delete,
or sort by the column.
Export the displayed FOV information and attributes by clicking
the download icon.
See more information in Manage Annotations on page 36.
Data Viewer Panel
Displays one or two data visualizations in separate panes (Figure
32). Click the icons in the panel header to toggle between a single
or paired display. For each visualization, select the pipeline step to
display from the dropdown menu. Customize the display by
selecting from the available dropdown menus. Click the arrow
(caret) to open additional customizable settings.
Figure 32: Data Viewer panel
[Page 36]
Manage Annotations
The Annotation Editor described here annotates data at the level of FOVs or flow cells. These annotations carry
forward into the study metadata in the CosMx SMI Data Analysis Suite for use with the Cell Typing (InSituType) and
Differential Expression modules. They can also be downloaded and used outside of the AtoMx platform, e.g. as
inputs to external data analysis pipelines.
To annotate data at the cell level, use the Sample Metadata custom modules (see Custom Modules on page 41).
To open Manage Annotations, select Manage Annotations from the dropdown in Spatial Discovery view (refer back
to Figure 15) or click the pencil icon from the Flow Cells panel (refer back to Figure 20). The Image Viewer and
Annotation Editor open (Figure 33).
Figure 33: Manage annotations window
Review the features of the Image Viewer in the section Image Viewer Panel on page 32.
In the Annotation Editor, view available flow cells and FOVs. Click Show Selected FOVs to list only the FOVs
highlighted in the Image Viewer. (Use Ctrl + Click to highlight multiple FOVs of interest or toggle the Highlight
visible FOVs button to automatically highlight all FOVs showing in the Image Viewer. See the yellow highlighted
FOVs in Figure 33.) Click the + icon to add flow cell or FOV attributes, which appear as additional columns in the
table (refer back to Figure 31). Click on a column's 3-dots menu to rename, delete, or sort by the column. Once
saved, column names cannot be changed.
Annotations cannot be added while a pipeline is running. Avoid using special characters and spaces. Do not use the
Update Sample Metadata custom module to add FOV-level annotations if using the Annotation Editor in the same
study.
Export the displayed FOV information and attributes by clicking the download icon.
Click Save Changes when done editing annotations to return to the Data Analysis Suite. To exit without saving,
click the x in the top right.
Annotations are applied only within the present study; they do not carry over to new studies.
[Page 37]
Run a Pipeline
For Spatial Discovery studies, the Spatial Discovery
pipeline (Figure 34) runs automatically upon study
creation. This pipeline consists of the modules Initial
Data, Quality Control, Normalization, Pathway Analysis,
Gene Selection, PCA, UMAP, Neighbor Network:
Expression Space, Leiden Clustering, Neighborhood
Analysis, Novae, Identify Marker Genes,
Spatial Discovery, and Differential Expression. (The
pipeline pauses at the Differential Expression module to
allow the parameters to be set by the user.) Read more
about the modules in CosMx SMI Pipeline Modules on
page 46.
The modules and their order cannot be customized.
Figure 34: Spatial Discovery data analysis pipeline.
For classic configuration studies, to create and customize a pipeline, click the Pipeline icon in the Pipeline Run
panel (refer back to Figure 16) to open the Run Pipeline window (Figure 35). Enter a run name that begins with a
letter, not a number (this requirement applies in AtoMx v2.0 and later).
Figure 35: Run Pipeline window, displaying the foundational pipeline for RNA data.
[Page 38]
Select or Edit a Defined Pipeline
The dropdown menu displays previously built and saved data analysis pipelines, including the foundational
pipelines for RNA and Protein. These pipelines consist of the modules Quality Control (QC), Normalization, PCA,
UMAP, Cell Typing, Neighbor Network: Expression Space, Leiden Clustering, Neighborhood Analysis, Identify
Marker Genes, and (for Protein) Cell Type Co-localization (Figure 35). Read more about the modules in CosMx SMI
Pipeline Modules on page 46.
After selecting a pipeline from the dropdown menu, click the pencil icon to edit it or the trash icon to delete it (not
available for the foundational pipelines).
The foundational pipeline is a good place to start if you are new to spatial data analysis or are just beginning to
explore your dataset. It may not suit all datasets and experimental design strategies. Neither the software nor
this user manual intends to prescribe the "right" way to analyze your data, and analysis should ultimately be
customized to the experimental design of your studies.
If you edit an existing pipeline and change the name, the software overwrites the original pipeline with the edits
and new name. It does not "Save As" a copy of the pipeline.
[Page 39]
Create a New Pipeline
Click Create Pipeline to build a new data analysis pipeline. In the Create New Pipeline window (Figure 36), enter a
pipeline name. Build your customized pipeline by dragging modules from the Toolbox into the gridded workspace.
Figure 36: Create New Pipeline window
Click and drag the grid to pan around the workspace, and use the mouse scroll wheel to zoom. Hover over or click
on a module in the workspace to summon arrows to create connectors between modules (Figure 37); click the
arrow, then the target module to draw a connecting arrow. If the module dependency prerequisites do not allow
you to connect two modules, a notification appears in the top right. Review the module dependency prerequisites
listed for each module in CosMx SMI Pipeline Modules on page 46. Number icons indicate the order of operations.
Hover over the info icon ⓘ to show information about the module, and
click the gear icon to show its settings. Detailed descriptions of each
module can be found in CosMx SMI Pipeline Modules on page 46.
To remove a module or connector from the pipeline, click on it in the
workspace, then click its red x.
Figure 37: Arrows create links between modules
Click Save to save the pipeline (enabling you to close the window or click
Back without losing your work).
Once the pipeline is constructed, click Save & Create Run. Enter a name
for the pipeline run. Back in the Pipeline Structure panel, click Run All to begin the run. Follow the progress of the
pipeline run as described on page 31.
[Page 40]
A single step may be re-run, if desired (if testing different parameters or in case of pipeline error). An option will be
presented to Rerun step or Rerun step with children (to also run the subsequent steps). In either case, a new
branch of the pipeline will be created. If the box for Start pipeline branch execution is checked, the step will
begin running immediately.
Please note, a re-run custom module cannot be run in interactive mode (see Pipeline Structure Panel on
page 31).
Figure 38: Retry pipeline step dialog
How to Run Data Analysis Pipelines That Require Parameter Inputs
The data analysis modules Cell Typing (CELESTA), Cell Type QC, Ligand-Receptor Analysis, and Differential
Expression require parameter inputs before running, causing the data analysis pipeline to pause until this
information is provided. If you need to see the output from upstream modules to set these parameters, there are
two ways to accomplish this. The first method is to set up the pipeline with the modules numbered such that
modules requiring input are last. Manually run (via the "play" button) each upstream module until satisfied with its
output. Then, run the downstream modules.
The second method is to run short, preliminary pipelines to help determine the necessary input parameters for the
modules. Once identified, you can construct the final data analysis pipeline with the appropriate parameters already
established.
For instance, the Cell Type QC module allows you to rename, merge, or delete cell type clusters generated by the
Cell Typing (InSituType) module. This module requires user input, such as new names, or cluster numbers to merge
or delete, which can only be determined after the Cell Typing (InSituType) module completes and its output is
evaluated. Therefore, it is advisable to run the upstream modules (or a separate test pipeline): Initial Data, QC,
Normalization, and Cell Typing (InSituType). Evaluate the results to determine the appropriate input for the Cell Type
QC module: which clusters, if any, ought to be renamed, merged, deleted, or subclustered. With this information,
you can confidently run the downstream modules (or build the final data analysis pipeline) with the appropriate input
parameters.
While not requiring parameter inputs to run, the Normalization, UMAP, and Cell Typing (InSituType) modules have
parameters that should be reviewed before running to ensure that they fit the experimental and analytical design.
Refer to the article "Tips when performing CosMx SMI data analysis with AtoMx SIP".
[Page 41]
Custom Modules
Custom modules are optional tools available for advanced users with some computational/coding background. The
Spatial Discovery pipeline cannot be modified with custom modules.
For the latest custom modules, please check the CosMx Data Analysis Modules page on GitHub. Of note, the
Sample Metadata custom modules enable data annotation at the cell level, which can then be used to subset data in
the Cell Typing (InSituType) or Differential Expression modules of an analysis pipeline, or used in an external data
analysis pipeline. (In AtoMx v2.0 and later, annotation at the flow cell or FOV level can be achieved following
Manage Annotations on page 36.)
From the Pipeline Run panel (refer back to Figure 16), click the custom modules icon. The Custom Modules
window opens (Figure 39). Edit or delete existing custom modules with the pencil and trash icons, respectively.
Figure 39: Custom Modules window with two custom modules loaded
To add a new custom module to the pipeline, click Add Module. The Add New Module window opens (Figure 40).
Enter the name (required; note that the characters / \ & % are not allowed), description, and parameters for running.
Upload the script file(s) where indicated, and set the entry point (the main executable file).
Designate the R version and RAM, Max runtime, and CPU core specifications for the custom module. Using more
than 16 cores will cause CPU throttling and may lead to diminishing returns.
In the Arguments section, click + to add the arguments or variables defined in the script that you want to
manipulate in the custom module. Set their type (Numerical, String, Boolean, Private, or File), name (exactly match
the variable in the script), display name (how it should appear in the user interface), the range of allowed values for
this argument, the default value, and whether or not the argument should be required in the module.
[Page 42]
Figure 40: Add new module window
Private arguments are those containing sensitive information, such as AWS credentials. Private arguments will not
appear in the user interface or log files.
For a file argument type, acceptable file formats are .csv, .txt, .RData, .RDS, and .rda. The file must be local (not in
S3).
In the Packages section, click + to input package information including repository (CRAN, Bioconductor, or
R-Forge), package, and version.
For an example of a custom module's input parameters, see Export Data on page 77.
Click Save to exit and add the defined module to the Custom Modules list, or exit without saving by clicking the x in
the top right.
NOTE: If receiving an error such as "invalid multibyte character at line 3", check the R script for formatted
quotation marks, which are not accepted by the SMI Data Analysis software. Quotation marks may be straight (" ")
but not curly (“ ”).
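A quick pre-upload check for formatted quotation marks can be sketched in Python as below. This is a minimal illustration, not part of AtoMx: the function names are hypothetical, and the inclusion of curly single quotes alongside the curly double quotes named in the note is an assumption.

```python
# Map of formatted ("curly") quotes to their straight equivalents.
# The manual names the double quotes; the single quotes are an assumption.
CURLY_TO_STRAIGHT = {
    "\u201c": '"',  # left double quotation mark
    "\u201d": '"',  # right double quotation mark
    "\u2018": "'",  # left single quotation mark
    "\u2019": "'",  # right single quotation mark
}


def find_curly_quotes(script_text):
    """Return (line, column, character) for every curly quote, 1-indexed."""
    hits = []
    for line_number, line in enumerate(script_text.splitlines(), start=1):
        for column, character in enumerate(line, start=1):
            if character in CURLY_TO_STRAIGHT:
                hits.append((line_number, column, character))
    return hits


def straighten_quotes(script_text):
    """Replace every curly quote with its straight equivalent."""
    return script_text.translate(str.maketrans(CURLY_TO_STRAIGHT))


# Example R script content with formatted quotes on line 3.
example_script = "x <- 1\ny <- 2\nlabel <- \u201cniche\u201d\n"
for line_number, column, character in find_curly_quotes(example_script):
    print(f"line {line_number}, column {column}: {character!r}")
```

Reviewing the reported positions before calling `straighten_quotes` is advisable, since a blanket replacement would also change any curly quotes that were intentionally placed inside string literals.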
If a custom module running in interactive mode is stuck in pending, de-select interactive mode and run the
pipeline again. See interactive mode description on page 31.
[Page 43]
Recommendations for Analyzing CosMx Whole Transcriptome (WTX) Data
A common analysis path for WTX data is outlined below. Remember that analysis in the CosMx SMI Data Analysis
Suite should be customized to the experimental design of your studies. Neither the software nor this user manual
intends to prescribe the "right" way to analyze your data. Additional resources for analyzing WTX data are online in
the CosMx Analysis ScratchSpace.
1. Evaluate the tissue image for staining, segmentation, and tissue integrity. If needed, consider resegmenting to
find a better fit of cell boundaries for your particular tissue/sample (see Cell Segmentation on page 14). Blurry
areas of the image may be indicative of tissue lifting away from the slide. Consider excluding these FOVs from
downstream analyses.
2. Quality Control
• After running the QC module in AtoMx, use the RNA QC Plot custom module to calculate the Signal-to-Noise
Ratio (SNR). (From the module output, open the file Per_FOV_data_quality_metrics and locate the column
Mean_transcripts_per_cell_per_FOV. Divide each value by the number of genes in the panel (18,934 for WTX)
to give the mean transcripts per plex per cell per FOV. Divide this value by the
mean_negative_probe_per_plex_per_cell_per_FOV column to give the SNR at FOV level.) Evaluating all FOVs
together, look for SNR > ~2. Keep in mind that SNR depends on many factors including sample quality (sample
ischemic time, fixation method, block age, section preparation, section age, sample processing, digestion
conditions, etc.).
• At the cell level, look for transcripts per cell > ~200. If too many cells are flagged (30% or more), consider
reducing this threshold. Transcripts per cell depends on many factors in study design and sample biology.
3. Normalization
• Total Counts Normalization is generally recommended as it keeps the data on a linear scale, is easily
interpretable, and is quick to run.
• Other transformations (log1p, Pearson, sctransform) are possible and may improve visualizations in some
datasets. However, the Pearson method is very resource- and time-intensive, so it is only recommended for
smaller datasets (using lower plex panels than WTX).
4. Gene Selection
• Selection of 3000 highly variable genes (HVGs) is recommended for downstream processing. Refer to the
module description for Gene Selection - RNA on page 51.
5. PCA
• Calculate 50 principal components from normalized counts for selected HVGs.
6. UMAP
• Recommended parameters (optimal parameters are project-dependent; these are suggested as a starting
point):
[Page 44]
  ◦ Minimum distance = 0.01; lower minimum distance generates more clusters.
  ◦ Spread = 5 or 2; higher spread yields more separation of clusters.
  ◦ Neighbors = 30; keep between 20-40; higher value yields more distinct clusters.
  ◦ Metric = cosine.
  ◦ Use between 15-50 principal components.
  ◦ Suggested resource: https://pair-code.github.io/understanding-umap/.
7. Cell Typing (Leiden Clustering or InSituType)
• Leiden Clustering gives an idea of the overall structure of the dataset. Adjust the "Resolution" parameter if
desired (higher numerical value generates more, smaller clusters).
• For Cell Typing (InSituType), fully- or semi-supervised cell typing is recommended. This method is preferable to
Leiden Clustering if you have a good reference profile, ideally derived from CosMx SMI data (see the CosMx
SMI-based profiles at https://github.com/Nanostring-Biostats/CosMx-Cell-Profiles or scRNA-seq-based profiles
at https://github.com/Nanostring-Biostats/CellProfileLibrary. If using scRNA-seq-based profiles, correct
platform effects by selecting Rescale = TRUE in the InSituType module).
• A subset of the total dataset can be designated for input into the Cell Typing (InSituType) module (see Cell
Typing (InSituType) Input parameters: on page 56). This can help data analysis pipelines by making the
working dataset a bit smaller. Cells that are excluded from Cell Typing analysis will have the label "NA" in the
result.
8. Neighborhood (Niche) Analysis
• Neighborhood Analysis operates on the cell-type level, not the transcript level (i.e. it is based on cell-typing
calls, not gene expression). A good starting point is often 7 niches with 50-micron radius. This analysis is
typically quick to run so it is recommended to iterate from the starting point to find the parameters that suit
your experimental design.
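The FOV-level SNR calculation described in step 2 (Quality Control) can be sketched in Python as follows. This is a minimal sketch under stated assumptions: it assumes the Per_FOV_data_quality_metrics file can be read as CSV and that it has an identifier column named FOV (a hypothetical name); only the two metric column names and the panel size of 18,934 come from the text above. The example values are made up.

```python
import csv
import io

WTX_PANEL_GENES = 18_934  # number of genes in the WTX panel, per the manual


def fov_snr(rows, n_genes=WTX_PANEL_GENES, fov_column="FOV"):
    """Return {FOV id: SNR} from rows of the per-FOV quality metrics table.

    fov_column is a hypothetical name for the FOV identifier column.
    """
    snr = {}
    for row in rows:
        # Mean transcripts per plex per cell per FOV.
        per_plex = float(row["Mean_transcripts_per_cell_per_FOV"]) / n_genes
        negative = float(row["mean_negative_probe_per_plex_per_cell_per_FOV"])
        # Guard against a zero background, which would leave the ratio undefined.
        snr[row[fov_column]] = per_plex / negative if negative > 0 else float("inf")
    return snr


def low_snr_fovs(snr, threshold=2.0):
    """List FOVs whose SNR is at or below the approximate guideline of 2."""
    return sorted(fov for fov, value in snr.items() if value <= threshold)


# Made-up values for illustration only.
EXAMPLE_CSV = """FOV,Mean_transcripts_per_cell_per_FOV,mean_negative_probe_per_plex_per_cell_per_FOV
1,1893.4,0.04
2,946.7,0.04
"""

snr_by_fov = fov_snr(csv.DictReader(io.StringIO(EXAMPLE_CSV)))
print(snr_by_fov)
print("FOVs to review:", low_snr_fovs(snr_by_fov))
```

Because the guideline is approximate (SNR > ~2), the threshold is left as a parameter rather than a hard rule.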
If running Differential Expression (DE) analysis in large WTX studies, consider asking cell type-specific DE
questions, subsetting, and running separate DE analyses by cell type, as they are often the most interesting and
direct way to interpret DE results concerning individual genes. To subset data for input into the DE module, see the
Differential Expression Input Parameters on page 67.
For DE questions that require using a full dataset, time to fit regression models depends on several factors,
including the number of cells, complexity of model formula and number of included covariates, and distribution
family (normal distribution (gaussian) / negative binomial (nbinom2)). If running DE with millions of cells at once is
necessary, there could be considerable time spent on model fitting per gene. For context, a small benchmarking of
running 10 million cells at once, for 1 thousand genes, with a simple model formula using a negative binomial mixed
model designed to identify genes DE among spatial niches across all cells (below) took 11 hours for full module
completion, or around 0.66 minutes per gene:
~RankNorm(otherct_expr) + (1|Run_Tissue_name) + spatialClusteringAssignments + offset(log(nCount_RNA))
For comparison, the same 10 million cells run using a gaussian / normal distribution with normalized data, and
replacing the random effect with a fixed effect (below) took slightly longer at 13.5 hours, or 0.81 minutes per gene:
[Page 45]
~RankNorm(otherct_expr) + Run_Tissue_name + spatialClusteringAssignments
In general, models with random effects will take longer for fitting than models with fixed effects. Negative binomial
model fitting is slower than Poisson model fitting, which in turn is slower than Gaussian model fitting for almost all
package implementations of these regression routines.
An exception to this general guidance is that negative binomial (and Poisson) mixed effects models are fit in AtoMx
SIP using the `nebula` R package (He et al. 2021), which is exceptionally fast, and is the reason for the faster
benchmarking described above.
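The per-gene figures quoted for the two benchmarks follow from simple arithmetic, sketched below in Python. The `projected_hours` helper is a hypothetical convenience that assumes the per-gene cost stays constant as the gene count changes; the manual does not state that runtime scales linearly, so treat any projection as a rough planning figure only.

```python
def minutes_per_gene(total_hours, n_genes):
    """Average wall-clock minutes per gene for a completed DE run."""
    return total_hours * 60 / n_genes


def projected_hours(per_gene_minutes, n_genes):
    """Naive linear projection (assumption: constant cost per gene)."""
    return per_gene_minutes * n_genes / 60


# Benchmarks from the text: 10 million cells, 1 thousand genes.
nb_mixed = minutes_per_gene(11, 1000)          # negative binomial mixed model
gaussian_fixed = minutes_per_gene(13.5, 1000)  # gaussian model, fixed effects

print(round(nb_mixed, 2), round(gaussian_fixed, 2))
```

Subsetting by cell type, as recommended above, reduces the number of cells per model fit and is therefore likely to matter more for runtime than any such projection suggests.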
[Page 46]
CosMx SMI Pipeline Modules
CosMx SMI data analysis pipelines are composed of different modules to customize analysis according to the
experimental design. Modules do not necessarily need to be run in the order listed in the following pages.
Prerequisites are defined in each module's description. See How to Run Data Analysis Pipelines That Require
Parameter Inputs on page 40 for guidance to tailor pipelines to your needs. See Appendix I: Literature References
on page 86 for applicable literature references.
Please note that analysis of custom protein targets relies on the selection of the correct custom SPK file at the
time of run set-up on the CosMx SMI instrument. Please contact support.spatial@bruker.com for additional support.
Analysis in the CosMx SMI Data Analysis Suite should be customized to the experimental design of your
studies. Neither the software nor this user manual intends to prescribe the "right" way to analyze your data.
Initial Data
Prerequisite modules: None
Module description: CosMx SMI data is loaded into the pipeline orchestrator for downstream analysis.
Output visualizations: Study statistics table (see CosMx SMI Data Visualizations on page 70). See Table 4 for
study statistics calculations.
Value | Calculation (FOV level) | Calculation (Flow cell level)
Mean transcript per cell | Average of all transcript-per-cell values for the FOV | Average of all transcript-per-cell values for the flow cell
Mean unique genes per cell | Average of all unique-genes-per-cell values for the FOV | Average of all unique-genes-per-cell values for the flow cell
10th percentile transcript per cell | 10th percentile of all transcript-per-cell values for the FOV | 10th percentile of all transcript-per-cell values for the flow cell
90th percentile transcript per cell | 90th percentile of all transcript-per-cell values for the FOV | 90th percentile of all transcript-per-cell values for the flow cell
Mean neg probe counts per cell | Sum of negative probe counts for the FOV / Total number of cells in the FOV | Per-FOV mean neg probe counts averaged across the flow cell
Table 4: Study statistics calculations.
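The calculations in Table 4 can be sketched in Python to make the difference between the FOV-level and flow-cell-level values concrete. This is an illustration only, not the AtoMx implementation: the record field names are hypothetical, and the percentile interpolation method (linear, via `statistics.quantiles` with `method="inclusive"`) is an assumption, since the table does not specify one.

```python
from collections import defaultdict
from statistics import mean, quantiles


def percentile_10_90(values):
    """10th and 90th percentiles (linear interpolation; needs >= 2 values)."""
    cuts = quantiles(values, n=10, method="inclusive")
    return cuts[0], cuts[-1]


def fov_stats(cells):
    """Per-FOV statistics from per-cell records (hypothetical field names)."""
    by_fov = defaultdict(list)
    for cell in cells:
        by_fov[cell["fov"]].append(cell)
    stats = {}
    for fov, group in by_fov.items():
        transcripts = [c["transcripts"] for c in group]
        p10, p90 = percentile_10_90(transcripts)
        stats[fov] = {
            "mean_transcripts": mean(transcripts),
            "mean_unique_genes": mean(c["unique_genes"] for c in group),
            "p10_transcripts": p10,
            "p90_transcripts": p90,
            # Sum of negative probe counts / number of cells in the FOV.
            "mean_neg_counts": sum(c["neg_counts"] for c in group) / len(group),
        }
    return stats


def flow_cell_stats(cells):
    """Flow-cell statistics; neg probe mean is the average of per-FOV means."""
    transcripts = [c["transcripts"] for c in cells]
    p10, p90 = percentile_10_90(transcripts)
    per_fov = fov_stats(cells)
    return {
        "mean_transcripts": mean(transcripts),
        "mean_unique_genes": mean(c["unique_genes"] for c in cells),
        "p10_transcripts": p10,
        "p90_transcripts": p90,
        "mean_neg_counts": mean(s["mean_neg_counts"] for s in per_fov.values()),
    }


# Made-up cells: three in FOV 1, two in FOV 2.
cells = [
    {"fov": 1, "transcripts": 100, "unique_genes": 40, "neg_counts": 1},
    {"fov": 1, "transcripts": 200, "unique_genes": 60, "neg_counts": 2},
    {"fov": 1, "transcripts": 300, "unique_genes": 80, "neg_counts": 3},
    {"fov": 2, "transcripts": 400, "unique_genes": 90, "neg_counts": 0},
    {"fov": 2, "transcripts": 600, "unique_genes": 110, "neg_counts": 2},
]
per_fov = fov_stats(cells)
flow_cell = flow_cell_stats(cells)
print(per_fov[1]["mean_neg_counts"], per_fov[2]["mean_neg_counts"])
print(flow_cell["mean_neg_counts"])
```

In this made-up data the flow-cell negative probe value is the average of the two per-FOV means, which differs from pooling all five cells because the FOVs contain different numbers of cells.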
[Page 47]
Quality Control - RNA
Prerequisite modules: Initial Data
Module description: This module flags unreliable negative probes, cells, FOVs, and target genes, as defined
below. Flagged items are not removed from downstream analysis.
• Negative probe QC flags negative control probes that appear to behave like outliers in the tissue. This helps
prevent tissue-specific or sample-specific background effects from impacting other QC metrics that rely on the
negative control probe values. Grubbs' test is used, and negative probes that are designated outliers (according
to the p-value parameter set by the user) are flagged.
• Cell QC flags cells with low or spurious signal based on the number of targets detected (more is better), fraction
of probes which are negative controls (fewer is better), uniformity of count distribution (higher complexity is
preferred), and cell size (the top percentile may need to be removed as a QC of segmentation).
• FOV QC identifies FOVs that have generally low expression. Two approaches are available: the Mean method
flags FOVs where the total count per cell averages below a threshold. The Quantile method flags FOVs where a
highly expressed gene is below background. See details, below.
• Target level QC flags targets that appear to be below background across the dataset, based on probe distribution
relative to negative control probes.
Custom pipeline module name (optional): Assign a unique name to your pipeline module, which will be used for
identifying pipeline module results in downstream analysis. Ensure the name is unique within your study and does
not exceed 100 characters. Only letters, numbers, underscores (_), hyphens (-), colons (:), and parentheses are
permitted. If left blank, a default name will be assigned.
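The naming rules above can be checked before entering a name, as in the following Python sketch. It is an illustration rather than the software's own validation: the function name is hypothetical, and treating "letters" as ASCII letters only is an assumption.

```python
import re

# Letters, numbers, underscores, hyphens, colons, and parentheses.
# Restricting "letters" to ASCII is an assumption.
ALLOWED_CHARACTERS = re.compile(r"[A-Za-z0-9_\-:()]+")
MAX_LENGTH = 100


def check_module_name(name, existing_names=()):
    """Return a list of problems; an empty list means the name looks acceptable."""
    if name == "":
        return []  # blank is allowed: a default name is assigned
    problems = []
    if len(name) > MAX_LENGTH:
        problems.append("name exceeds 100 characters")
    if not ALLOWED_CHARACTERS.fullmatch(name):
        problems.append("name contains characters that are not permitted")
    if name in existing_names:
        problems.append("name is already used in this study")
    return problems


print(check_module_name("QC_strict-v2:(test)"))
print(check_module_name("QC strict", existing_names={"QC_default"}))
```

Returning a list of problems rather than a single boolean makes it easier to report every issue with a candidate name at once.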
Input parameters:
• Negative probe QC:
  ◦ Outlier P-value cutoff: (default: 0.01; range: 0-1) to flag outlier negative probes.
  ◦ Remove QC flagged negative control probes (yes/no). (It is generally recommended to be aware of, but not
  remove, flagged items from the dataset.)
• Cell QC:
  ◦ Minimal counts per cell: recommend 50 or 100 for 6K panel; 20 for 1000-plex panel; 5 for 100-plex panel; must
  be > 1. Increase the threshold to make QC more stringent.
  ◦ Proportion of negative counts: flag cells where > 10% (0.1, the default value) of the counts per cell are negative
  probes. Decrease the threshold to make QC more stringent.
  ◦ Count distribution: this metric provides a measure of transcription complexity. It calculates the (total counts) /
  (number of detected genes). If set to 1 (default), the module will flag cells in which the number of total counts
  equals the number of detected genes (i.e. each detected gene has 1 count). If set to 100, the module will flag
  cells in which the number of total counts is less than 100x the number of detected genes. The allowable range
  is 1-200 but it is recommended to leave the value set at 1.
  ◦ Area outlier: Grubbs' test p-value (default: 0.01, range 0-1) to flag outlier cells based on cell area. If
[Page 48]
  megakaryocytes or other cell types known to have large cell area are in your sample, keep the p-value low to
  make the outlier designation stringent.
l FOV QC:
  l FOV QC method: mean (default) or quantile. The mean method is based on transcript levels as defined in
  FOV count cutoff, below. The quantile method is based on the relationship between a high signal and
  background, as defined in FOV QC quantile, below. The method of choice is up to the user.
  l FOV count cutoff: only applies to Mean method. Flags FOV that have an average transcripts-per-cell below this
  value (default: 100; range ≥ 0).
  l FOV QC quantile and FOV quantile to negative cutoff: only applies to Quantile method. Flags FOV in which a
  high signal (e.g., 90th percentile gene count) is not sufficiently above background (the median of the negative
  probes). "Sufficiently above" is defined by the FOV quantile to negative cutoff. Quantile default: 0.9 (90th
  percentile), range 0-1. Cutoff default: 0, must be ≥ 0.
l Target level QC:
  l Negative control probe quantile cutoff: set the threshold at which to flag probes. A value of 0.5 (default) will
  flag probes with lower total counts than the median (50th percentile) of the negative control probes' counts.
  Range 0-1.
  l Detection: set the p-value at which a target is deemed to be above background (0.01 default, range 0-1). It is
  recommended to set the value to 0, resulting in no flags applied to the targets. (If you wish to adjust this
  parameter, be aware that a smaller p-value requires that the target is higher above background. A larger p-value
  allows the target to be closer to background.)
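To make the Cell QC thresholds concrete, the sketch below computes the three count-based flags for a single cell. It is illustrative only and not the module's implementation: the function name, the flag names, and the probe names in the usage note are hypothetical. It assumes that "counts per cell" includes negative probe counts, that the count distribution ratio uses target counts only, and that a cell passes only when it is strictly above the minimum-count and count-distribution thresholds. The area outlier flag (Grubbs' test) is not shown.

```python
def cell_qc_flags(counts, negative_probes, min_counts,
                  max_neg_proportion=0.1, count_distribution=1.0):
    """Return per-metric flags (True = flagged) for one cell.

    counts: dict mapping probe or target name -> count in this cell
    negative_probes: set of names that are negative control probes
    """
    neg_total = sum(c for name, c in counts.items() if name in negative_probes)
    target_total = sum(c for name, c in counts.items()
                       if name not in negative_probes)
    total = neg_total + target_total
    # A gene counts as "detected" when it has at least one count.
    detected = sum(1 for name, c in counts.items()
                   if name not in negative_probes and c > 0)

    flags = {
        # Assumption: pass requires counts strictly greater than the threshold.
        "low_counts": total <= min_counts,
        # Flag when more than the allowed share of counts are negative probes.
        "high_negative_proportion":
            total == 0 or neg_total / total > max_neg_proportion,
        # Ratio of total counts to detected genes; with the default of 1,
        # only cells where every detected gene has exactly one count are flagged.
        "low_complexity":
            detected == 0 or target_total / detected <= count_distribution,
    }
    flags["any"] = any(flags.values())
    return flags
```

With `min_counts=20`, a cell with counts `{"CD3E": 30, "CD8A": 12, "PTPRC": 25, "NegPrb1": 1}` raises no flags, whereas a cell with `{"CD3E": 1, "CD8A": 1, "NegPrb1": 1}` is flagged on all three metrics.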
Output visualizations: QC metrics table (see Table 5), XY plot, heatmap, box plot, violin plot, and histogram. See
CosMx SMI Data Visualizations on page 70.
Explore your QC dataset:
l Look at QC data overlaid on the tissue image by including the Image Viewer panel in your Data Analysis Suite
view, selecting the button Cells to overlay cell data, then selecting Expression and Step: Quality Control from
the dropdown menus. From the next dropdown menu, select a gene with known or expected expression in the
tissue. Does it pass a 'sanity check' in terms of its expression in this region of the sample? Do visualization
markers' expression match the tissue morphology?
[Page 49]
Negative Probe QC Summary
  Outlier Test: % Pass is % of negative probes that pass Grubbs' test (i.e., are not outliers) based on the
  p-value threshold set in module parameters.
Cell QC Summary
  Minimum Counts: % Pass is % of cells above the minimum counts per cell threshold set in module parameters.
  Proportion of Negatives: % Pass is % of cells with negative probe count proportion less than the threshold
  set in module parameters.
  Complexity (count distribution): % Pass is % of cells with ratio of total counts to the number of detected
  genes greater than the value set in module parameters.
  Area: % Pass is % of cells with cell area which is not designated as an outlier according to the threshold
  set in module parameters.
FOV QC Summary
  FOVs Flagged: % Pass is % of FOVs that were not flagged in FOV QC, according to module parameters set by user.
  Cells Flagged across FOVs: % Pass is % of cells belonging to FOVs that passed FOV QC.
Target QC Summary
  Negative Control: % Pass is % of targets where the target's total counts is greater than the defined
  quantile of the negative control probes' total count, set in module parameters.
  Detection over Background: % Pass is % of targets above background, according to p-value threshold set in
  module parameters.
Table 5: RNA QC metrics and pass definitions
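The two FOV QC methods and the "% Pass" figures in Table 5 can be illustrated with a short sketch. This is not the module's code. In particular, the manual does not state how the quantile is compared to background, so this sketch assumes a simple difference (quantile of per-gene counts minus the median negative probe count) compared against the cutoff. All function names are hypothetical.

```python
from statistics import mean, median


def quantile(values, q):
    """Linear-interpolation quantile of a non-empty list, with 0 <= q <= 1."""
    ordered = sorted(values)
    pos = q * (len(ordered) - 1)
    lower = int(pos)
    upper = min(lower + 1, len(ordered) - 1)
    return ordered[lower] + (pos - lower) * (ordered[upper] - ordered[lower])


def fov_flag_mean(cell_totals, count_cutoff=100):
    """Mean method: flag the FOV when average transcripts per cell is below the cutoff."""
    return mean(cell_totals) < count_cutoff


def fov_flag_quantile(gene_counts, negative_counts, q=0.9, cutoff=0.0):
    """Quantile method (assumed form): flag the FOV when the q-th quantile of
    gene counts does not exceed the median negative probe count by more than `cutoff`."""
    return quantile(gene_counts, q) - median(negative_counts) <= cutoff


def percent_pass(flags):
    """% Pass as used in Table 5: share of items that were not flagged."""
    return 100.0 * sum(1 for flagged in flags if not flagged) / len(flags)
```

For instance, an FOV whose cells carry 40, 60, and 80 transcripts averages 60 per cell and is flagged by the Mean method at the default cutoff of 100.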
Read more about factors contributing to data quality in the "Top 3 Tips for Successful CosMx SMI Single Cell Spatial
Runs at 1000-Plex".
[Page 50]
Quality Control - Protein
Prerequisite modules: Initial Data
Module description: This module flags unreliable cells based on segmented cell area, negative probe expression,
and high/low target expression. Cells with outlier Grubbs' test p-values < 0.01 for segmented area are flagged. Cells
with mean negative probe values below the lower threshold or above the upper threshold (as defined in input
parameters; see below) are flagged. Cells with overly high or low target expression (as defined in input parameters;
see below) are flagged. Flagged cells are not removed from the dataset.
Custom pipeline module name (optional): Used to identify module results in downstream analysis; see full
description on page 47.
Input parameters:
l Negative probe range: flag cells with negative probe mean below the lower threshold (default 2) or above the
upper threshold (default 50).
l High expression proteins' proportion and threshold: flag cells where 50% or more (0.5, default; range 0-1) of
proteins are in the 90th percentile or higher (0.9, default; range 0-1). To make QC more stringent, decrease
percentage and/or decrease threshold.
l Low expression proteins' number and threshold: flag cells where fewer than 3 (default; range 0-N where N is
total number of proteins in panel) proteins are in the 50th percentile or higher (0.5, default; range 0-1). To make
QC more stringent, increase number and/or increase threshold.
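The three protein cell flags above can be sketched as follows. This is an illustration under stated assumptions, not the module's implementation: it assumes each protein's percentile is computed across all cells in the dataset, and the function names and data layout (one dict of protein intensities per cell) are hypothetical. The area flag (Grubbs' test) is not shown.

```python
def quantile(values, q):
    """Linear-interpolation quantile of a non-empty list, with 0 <= q <= 1."""
    ordered = sorted(values)
    pos = q * (len(ordered) - 1)
    lower = int(pos)
    upper = min(lower + 1, len(ordered) - 1)
    return ordered[lower] + (pos - lower) * (ordered[upper] - ordered[lower])


def protein_cutoffs(cells, q):
    """Per-protein q-quantile across all cells (assumed percentile definition)."""
    return {p: quantile([cell[p] for cell in cells], q) for p in cells[0]}


def protein_cell_flags(cell, neg_mean, high_cut, low_cut,
                       neg_range=(2, 50), high_proportion=0.5, low_number=3):
    """Return flags (True = flagged) for one cell.

    cell: dict protein -> intensity; neg_mean: mean negative probe value of the cell
    high_cut / low_cut: per-protein cutoffs from protein_cutoffs()
    """
    n_high = sum(1 for p, v in cell.items() if v >= high_cut[p])
    n_above_low = sum(1 for p, v in cell.items() if v >= low_cut[p])
    return {
        "negative_probe": neg_mean < neg_range[0] or neg_mean > neg_range[1],
        # Too many proteins at or above the high percentile.
        "high_expression": n_high / len(cell) >= high_proportion,
        # Too few proteins at or above the low percentile.
        "low_expression": n_above_low < low_number,
    }
```

Compute the cutoffs once per dataset, for example `high_cut = protein_cutoffs(cells, 0.9)` and `low_cut = protein_cutoffs(cells, 0.5)`, then evaluate each cell.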
Output visualizations: QC metrics table (see Table 6); XY plot, heatmap, box plot, violin plot, and histogram. See
CosMx SMI Data Visualizations on page 70.
Explore your QC dataset:
l Look at QC'd data overlaid on the tissue image (include the Image Viewer panel in your Data Analysis Suite
view, select Cells to overlay cell data, then Expression and Step: Quality Control from the dropdown menus.
From the next dropdown menu, select a gene with known or expected expression in the tissue). Does it pass a
'sanity check' in terms of its expression in this region of the sample? Do visualization markers' expression match
the tissue morphology?
Cell QC Summary
  Negative probes: % Pass is % of cells with negative probe mean within the range set in module parameters.
  Low expression: % Pass is % of cells with enough protein expression to meet criteria set in module parameters.
  High expression: % Pass is % of cells with few enough high-expressing proteins to meet the criteria set in
  module parameters.
  Area: % Pass is % of cells with cell area which is not designated as an outlier in Grubbs' outlier test (p < 0.01).
Table 6: Protein QC metrics and pass definitions
[Page 51]
Gene Selection - RNA
Prerequisite modules: Quality Control
Module description: Gene Selection defines a set of genes for downstream analysis of RNA datasets. It does not
remove genes from the dataset, but enables a focused analysis in certain downstream modules. Genes can be
selected by the user (by uploading a list) or automatically based on the highly variable genes calculated by Seurat's
FindVariableFeatures method. Additional guidance can be provided to the module by naming genes to include or
exclude in the selected set. The resulting gene set can be used as input for Normalization, Cell Typing (InSituType),
PCA, Identify Marker Genes, and Spatial Expression Analysis modules.
Custom pipeline module name (optional): Used to identify module results in downstream analysis; see full
description on page 47.
Input parameters:
l Gene List Name (create a name for this subset of genes; only letters, numbers, underscores and periods are
allowed, and the name cannot start with a number)
l Select Include Highly Variable Genes to subset based on the Seurat-calculated highly variable gene list.
Optionally adjust the Smoothing Sensitivity (default 0.3), which tunes Seurat calling of highly variable genes.
Increasing the value results in more smoothing; it will favor genes with consistently high variance across broader
expression ranges. Decreasing the value results in less smoothing; the model may catch more highly variable
genes within specific, narrow expression windows, but may include genes that are variable due to noise rather
than biological signal. Select the number of highly variable genes to include and (optionally) select genes to
require or exclude from the dropdown menus.
l Select Include User Selected Genes to subset based on a user-provided gene list. Download the full gene list
by clicking the download icon, edit as needed, and upload back to the module by clicking Upload Custom Gene
List.
Output visualizations: No visualizations available, but the resulting gene list can be used as input for
Normalization, Cell Typing (InSituType), PCA, Identify Marker Genes, and Spatial Expression Analysis modules
(select the appropriate Gene List name when setting the input parameters for the downstream module; see
additional notes in those modules' descriptions, below). The resulting gene list can also be downloaded after the
module runs.
Normalization - RNA
Prerequisite modules: Quality Control
Module description: Generates normalized expression data from counts. RNA normalization adjusts for library size
factors to ensure that cell-specific total transcript abundance and distribution of counts (which may vary between
some FOVs and between samples) do not influence downstream visualization and data analysis. Three
normalization methods are available:
l Total count normalization (default): Gene count in a cell / total counts in the cell.
l Seurat uses Seurat::NormalizeData() with "LogNormalize" default setting. (Feature counts for each cell are divided
by the total counts for that cell and multiplied by the scale factor. This is then natural-log transformed using
[Page 52]
log1p.) See reference in Appendix I: Literature References on page 86.
l Pearson residuals normalization is based on the estimated mean and variance: (gene count in a cell - mean gene
count in the cell) / standard deviation of gene counts in the cell. An overdispersion factor can be specified in the
module. Overdispersion is variance greater than what is predicted by the model. Learn more from tutorials such
as "Regression with Count Data: Poisson and Negative Binomial". See reference in Appendix I: Literature
References on page 86.
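The three methods can be compared side by side in a short sketch. This is illustrative and not the module's code. The scale factor of 10,000 is the Seurat `NormalizeData` default. The Pearson residuals function uses one common formulation (expected counts from cell and gene totals, with a negative binomial variance); whether the module uses this exact model, and how its overdispersion value maps onto the `theta` parameter here, is an assumption.

```python
import math


def total_count_normalize(cell_counts):
    """Gene count in a cell divided by total counts in that cell."""
    total = sum(cell_counts.values())
    return {g: (c / total if total else 0.0) for g, c in cell_counts.items()}


def log_normalize(cell_counts, scale_factor=10_000):
    """LogNormalize: scale by total counts, multiply by scale factor, apply log1p."""
    total = sum(cell_counts.values())
    return {g: (math.log1p(c / total * scale_factor) if total else 0.0)
            for g, c in cell_counts.items()}


def pearson_residuals(matrix, theta=100.0):
    """Assumed formulation: (count - expected) / sqrt(expected + expected^2 / theta).

    matrix: list of rows (cells), each a list of gene counts in the same gene order.
    """
    if theta <= 0:
        raise ValueError("this sketch requires theta > 0")
    cell_totals = [sum(row) for row in matrix]
    gene_totals = [sum(col) for col in zip(*matrix)]
    grand_total = sum(cell_totals)
    residuals = []
    for row, cell_total in zip(matrix, cell_totals):
        out = []
        for count, gene_total in zip(row, gene_totals):
            expected = cell_total * gene_total / grand_total
            variance = expected + expected * expected / theta
            out.append((count - expected) / math.sqrt(variance)
                       if variance > 0 else 0.0)
        residuals.append(out)
    return residuals
```

A cell with counts `{"A": 3, "B": 1}` gives proportions 0.75 and 0.25 under total count normalization; under LogNormalize the value for A is `log1p(7500)`.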
Custom pipeline module name (optional): Used to identify module results in downstream analysis; see full
description on page 47.
Input parameters: Normalization method (select Seurat, Pearson residuals or Total counts). For Pearson residuals
method, set overdispersion value (see above): default 100; must be ≥ 0. Gene List Name (if the Gene Selection
module has been run, the resulting gene list is available to select for input to the Normalization module. The
Normalization module will produce a subset of the expression matrix based on the subsetted genes).
Output visualizations: XY plot (Figure 41), heatmap, box plot, violin plot, histogram. See CosMx SMI Data
Visualizations on page 70.
Explore your normalized dataset:
l Evaluate the data for normalization bias: overlay normalized data on the tissue image by including the Image
Viewer panel in your Data Analysis Suite view, selecting the button Cells to overlay cell data, then Expression
and Step: Normalization from the dropdown menus. From the next dropdown menu, select a housekeeping
gene or other target expected to have even expression throughout the tissue. Is expression bias observed
across FOVs?
Normalized data is used as the input to generate heatmaps, violin plots, box plots, PCA, UMAP, and to visualize
counts on tissue. It is not used as the input to differential expression or other modules that include a
normalization function.
Figure 41: Normalized data in XY plot, colored by expression, helps evaluate normalization bias.
[Page 53]
Normalization - Protein
Prerequisite modules: Quality Control (Protein)
Module description: Generates normalized (background-subtracted) protein data from mean fluorescence
intensity (MFI) values. Protein normalization is based on the concepts of:
l Total intensity scaling, to reduce the effect of technical artifacts such as shading or edge effects.
l arcsinh transformation, to improve visualization clarity and stabilize variance across the sample.
Total intensity scaling: Since protein data involves continuous intensities rather than counts, the total intensity
for a cell refers to the sum of (average intensity for each protein) in the cell. This accounts for technical artifacts
where certain parts of the image are brighter or dimmer. Total intensity scaling is essentially converting from an
absolute intensity to a proportion. This proportion then gets scaled back up by the average (across cells) total
intensity. This is similar to RNA normalization in which counts for a given gene in a cell are divided by total counts
across all genes in a cell.
The arcsinh transformation is used to stabilize the variance, so that observations with higher intensity don’t also
have higher variance in that intensity. Arcsinh is a standard data transformation in modern flow cytometry
comprised of linear scaling for values close to zero and logarithmic scaling for larger (negative and positive)
values, with the transition between scales smoothed out. It brings protein data to a more "normal" distribution.
Read more in Finak, Perez, Weng et al (2010) and Folcarelli and van Staveren et al (2021).
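The two steps described above can be sketched in a few lines. This is a minimal illustration, not the module's code: the function names are hypothetical, and the `cofactor` argument is an assumption (cytometry workflows often divide by a cofactor before applying arcsinh, but the manual does not state one, so the default of 1 leaves values unscaled).

```python
import math


def total_intensity_scale(cells):
    """Convert each protein's intensity to a proportion of the cell's total,
    then scale back up by the average total intensity across cells.

    cells: list of dicts mapping protein -> mean fluorescence intensity
    """
    totals = [sum(cell.values()) for cell in cells]
    mean_total = sum(totals) / len(totals)
    return [{p: (v / total * mean_total if total else 0.0)
             for p, v in cell.items()}
            for cell, total in zip(cells, totals)]


def arcsinh_transform(cells, cofactor=1.0):
    """Apply arcsinh: near-linear close to zero, logarithmic for large values."""
    return [{p: math.asinh(v / cofactor) for p, v in cell.items()}
            for cell in cells]
```

For example, two cells with the same relative expression but a fourfold brightness difference, `{"CD3": 10, "CD8": 30}` and `{"CD3": 40, "CD8": 120}`, both scale to `{"CD3": 25.0, "CD8": 75.0}`, which is how the scaling reduces shading and edge effects.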
Custom pipeline module name (optional): Used to identify module results in downstream analysis; see full
description on page 47.
Input parameters: Total intensity normalization (yes/no); Transformation (yes/no). Default: 'yes' for all parameters.
Output visualizations: XY plot (Figure 41), heatmap, box plot, violin plot, histogram. See CosMx SMI Data
Visualizations on page 70.
Explore your normalized dataset:
l Evaluate the data for normalization bias: overlay normalized data on the tissue image (include the Image Viewer
panel in your Data Analysis Suite view, select Cells to overlay cell data, then Expression and
Step: Normalization from the dropdown menus). From the next dropdown menu, select a housekeeping gene
or other target expected to have even expression throughout the tissue. Is expression bias observed across
FOVs?
Normalized data is used as the input to generate heatmaps, violin plots, box plots, PCA, UMAP, and to visualize
counts on tissue. It is not used as the input to differential expression or other modules that include a
normalization function.
[Page 54]
Principal Component Analysis (PCA) - RNA or Protein
Prerequisite modules: Normalization
Module description: PCA provides an orthogonally constrained dimensional reduction analysis of the count data
across all cells in the dataset. It produces output values (principal components, or PCs) representing axes of
variation within the data, which are a combined value of weighted expression in a given cell. PCs are ordered by
decreasing variation explained in the data. These can be used to better understand variation within a dataset, but are
most commonly used in single-cell analysis as an input for the UMAP analysis.
Custom pipeline module name (optional): Used to identify module results in downstream analysis; see full
description on page 47.
Input parameters: Number of principal components calculated (default 50; must be an integer ≥ 3), Gene List
Name (if the Gene Selection module has been run, the resulting gene list is available to select for input to the PCA
module. Please note that the Normalization module automatically passes its results to the PCA module, so if
Normalization was run on a subset of genes, and PCA follows, it will also run on that subset of genes regardless of
the selection made in "Gene List Name" in the PCA module. If Normalization was performed on all genes, and the
Gene Selection module followed, then the PCA module can use either the subsetted genes from the Gene
Selection module or all genes, as set by "Gene List Name" in the PCA module).
Output visualizations: PCA plot. See CosMx SMI Data Visualizations on page 70.
Explore your PCA dataset: While clustering may be scrutinized here, generally, the PCA dataset feeds directly
into UMAP and clustering is evaluated there.
UMAP - RNA or Protein
Prerequisite modules: PCA
Module description: UMAP (Uniform Manifold Approximation and Projection for dimension reduction) provides a
visualization of high-plex complex datasets in 2-dimensional space using a non-linear approach to estimate related
groups of cells or features. This method is a common way of visualizing single-cell data to identify clusters of related
cells which may be from the same lineage.
Custom pipeline module name (optional): Used to identify module results in downstream analysis; see full
description on page 47.
Input parameters:
l Number of neighbors: the number of neighboring points used in local approximations of manifold structure
(default 30, range 5-50). Increase the value to preserve global structure at the loss of detailed local structure.
l Minimum distance: controls how tightly the embedding compresses points together (default 0.01, range 0.001-
0.5). Increase the value to more evenly distribute embedded points; decrease the value to allow the algorithm to
optimize more accurately with regard to local structure.
l Spread: the effective scale of embedded points. In combination with minimum distance (above), this parameter
determines how clustered the embedded points are (default 5, range 0.5-10). Increase value to increase spread
and reduce clustering.
[Page 55]
l Distance metric: select Cosine (default), Euclidean, Manhattan, or Hamming, to determine the metric used to
measure distance in the input space.
Read more about distance metrics in machine learning in Ehsani and Drabløs (2020).
l Data fraction: set a % of PCA data to use as input. A value of 1 uses 100% of the data and results in a standard
UMAP. A value of 0.25 uses 25% of the data to enable a UMAP projection with less computational burden (i.e.,
faster). Default 0.25, range 0.01-1.
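For readers unfamiliar with the four distance metric options, the sketch below gives their standard textbook definitions for two equal-length vectors. These are generic definitions offered for orientation; the module's internal implementation is not shown in this manual.

```python
import math


def euclidean(a, b):
    """Straight-line distance."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan(a, b):
    """Sum of absolute coordinate differences."""
    return sum(abs(x - y) for x, y in zip(a, b))


def cosine(a, b):
    """One minus the cosine of the angle between the vectors (ignores magnitude)."""
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        # Convention chosen for this sketch: two zero vectors are identical.
        return 0.0 if norm_a == norm_b else 1.0
    dot = sum(x * y for x, y in zip(a, b))
    return 1.0 - dot / (norm_a * norm_b)


def hamming(a, b):
    """Fraction of positions at which the two vectors differ."""
    return sum(1 for x, y in zip(a, b) if x != y) / len(a)
```

Cosine distance treats `[1, 2]` and `[2, 4]` as essentially identical because they point in the same direction, whereas Euclidean distance does not.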
Output visualizations: UMAP plot (displaying data from all FOVs and flow cells in the study). See CosMx SMI Data
Visualizations on page 70.
Explore the UMAP dataset:
l Evaluate the UMAP plot: include the Pipeline Structure panel and Pipeline Data panel in your Data Analysis Suite
view, then select the UMAP module in the Pipeline Structure panel. Select different color schemes by clicking the
arrow (caret) to expand customization options. Try coloring the UMAP plot by tissue annotation, expression of
individual targets, cell type, or total cell transcript counts, to evaluate clusters.
l Compare the UMAP plot to the XY plot to see where clusters of cells defined in the UMAP exist in the tissue:
include the Data Viewer panel in the Data Analysis Suite view, and select step: UMAP for one visualization and
step: Normalization for the other (Figure 42). Display the Normalization plot as a scatter plot and color code by
cell type. Synchronize the color coding scheme to allow comparison of certain cell types in both visualizations.
l Examine patterns of co-expression between cell markers or targets with known behavior and the targets of
interest to your experimental design.
Figure 42: Use the Data Viewer panel to evaluate UMAP data (top) compared to expression data displayed in an
XY plot (bottom).
[Page 56]
Cell Typing (InSituType) - RNA
Prerequisite modules: QC, Normalization, or PCA
Module description: This module uses the InSituType algorithm to identify and subset data based on cell types
(see Danaher et al. 2022). Read more about this module from the InSituType FAQs hosted on GitHub and Cell
Typing: Advanced Strategies article in the CosMx Analysis Scratch Space. Three methods are available:
l Supervised clustering: Cell type assignments are made based on a reference matrix specifying the average
expression profile of each cell type. Use one of the provided reference matrices or generate your own (see
instructions in gray box on page 46). A quality reference matrix will:
  • Include all the cell types present in your tissue. Granular cell types are preferred (e.g., separate
  profiles for "dendritic cell", "M1 macrophage", "M2 macrophage", etc.), but broad cell types are
  accepted (e.g., a single "myeloid" profile).
  • Include most of the genes from your CosMx SMI panel.
  • Come from a robust dataset. A profile based on just 20 cells from a rare cell population will be
  inaccurate.
l Unsupervised clustering: Cell type clusters are determined by the software without a reference matrix input;
then cell type labels can be assigned to clusters based on marker expression or other characteristic. The single
argument in unsupervised clustering is the number of clusters to fit. Evaluate the UMAP to infer a reasonable
number of clusters; or rely on the default value of 10 clusters which works well in most settings. After evaluating
your clustering results, you have the option to merge closely-related clusters or break up overly large clusters
(see module Cell Type QC - RNA on page 60). Note that the number of clusters should be > 1 and must be an
integer, not a range.
l Semi-supervised clustering: A cell type reference matrix is provided to the software, but not all cells in the
dataset must fit into a reference cell type category. The user may add their own cell type definitions. The
algorithm initially fits cells using the reference matrix and cells that do not fit are split into n clusters (see input
parameters, below).
Custom pipeline module name (optional): Used to identify module results in downstream analysis; see full
description on page 47.
Input parameters:
l Column for Subset: If desired, select a subset of cells to include in the Cell Typing analysis, based on the columns
of the SampleMetadata.csv file (obtained using the Get Sample Metadata custom module; see Custom Modules
on page 41 and module-specific instructions with the custom script in GitHub). Only columns containing boolean
data are eligible (true/false, 1/0, or pass/fail).
For example, to include only cells that passed QC, build a pipeline with Cell Typing (InSituType) downstream of
the Quality Control module. After the QC module executes, the InSituType module field "Column for Subset" will
populate with the available QC parameters. Select "qcCellsPassed" from this dropdown menu to run the
InSituType module only on cells that passed QC. Refer to Table 7 for Column for Subset definitions.
It is also possible to subset based on multiple parameters: follow the custom module instructions (in GitHub) to
get the SampleMetadata.csv file, add a new column such as "FOV1-10 and passed QC", and fill in boolean data
for each row (true/false). Use the Update Sample Metadata custom module to apply the metadata to the study and
[Page 57]
enable selection of this parameter in the field "Column for Subset". Cells that are excluded from Cell Typing
analysis will have the label "NA" in the result.
Column Name: Definition
  qcFlagsCellCounts: Captures cells with counts per cell greater than the minimum threshold set in QC module
  parameters
  qcFlagsCellPropNeg: Captures cells with an acceptable proportion of negative probe counts per cell, as set in QC
  module parameters
  qcFlagsCellComplex: Captures cells that exceed the minimum count distribution (total counts / number of
  detected genes) as set in QC module parameters
  qcFlagsCellArea: Captures cells that are not outliers in area as set in QC module parameters
  qcCellsFlagged: Captures cells that fail any of these metrics
  qcCellsPassed: Captures cells that pass all of these metrics
Table 7: InSituType 'Column for Subset' Definitions
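As an illustration of the multi-parameter subsetting described above, the sketch below adds a combined boolean column to metadata held as CSV text. It is an offline sketch only: the `fov` column name and the function name are hypothetical (check the actual headers in your SampleMetadata.csv), and the file must still be applied to the study with the Update Sample Metadata custom module.

```python
import csv
import io


def add_subset_column(csv_text, new_column="FOV1-10 and passed QC",
                      fov_column="fov", qc_column="qcCellsPassed",
                      fov_range=range(1, 11)):
    """Return CSV text with an added true/false column that is true only for
    cells in the given FOV range that also passed QC.

    Column names are assumptions; adjust them to match the real file.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    fieldnames = list(reader.fieldnames) + [new_column]
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    for row in reader:
        # Accept the three boolean encodings the manual lists as eligible.
        passed = row[qc_column].strip().lower() in ("true", "1", "pass")
        in_range = int(row[fov_column]) in fov_range
        row[new_column] = "true" if (passed and in_range) else "false"
        writer.writerow(row)
    return out.getvalue()
```

A cell in FOV 3 that passed QC receives "true"; a cell in FOV 12, or one that failed QC, receives "false".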
l Gene List Name: If the Gene Selection module has been run, the resulting gene list is available to select for input
to the Cell Typing (InSituType) module. If 'All genes' is selected, this module will run on all genes even if the
upstream Normalization module was run on a Gene Selection subset.
l Select from Supervised, Unsupervised, or Semi-supervised Clustering and input the Basic Parameters:
  l If Supervised, upload a reference matrix in .csv or .RData format (with genes in rows, cell types in
  columns, and expression values filling the matrix; max file size 100 MB). If uploading an .RData file,
  the underlying variable must be called profile_matrix. Values should be untransformed linear scale,
  starting from 0. Scaling of columns does not matter. See gray box, below, for instructions to obtain
  a reference matrix. The dialog "Awaiting Input" indicates the reference matrix is not yet successfully
  uploaded.
  l If Unsupervised, select number of clusters to generate (recommend 10-20).
  l If Semi-supervised, set the number of clusters to which the algorithm will assign cells that do not fit
  the reference matrix, and upload a reference matrix (see gray box, below). The dialog "Awaiting
  Input" indicates the reference matrix is not yet successfully uploaded.
  l Check the box to include the segmentation marker signal in the calculation.
To download a defined reference matrix, refer to the CosMx SMI-based profiles or scRNA-seq-based profiles
hosted on GitHub. Download the appropriate .RData file. (If using scRNA-seq-based profiles, correct platform
effects by selecting Rescale = TRUE in the InSituType module; more details below.)
Alternatively, derive your own matrix from an appropriate single-cell RNA-seq (scRNA-seq) dataset. Use a
defined reference matrix as a template file to produce a .csv or .RData file that will be recognized by the
software. If uploading an .RData file, the underlying variable must be called profile_matrix. (It is not necessary
to scale the values to match the template file, as long as all matrix data is from the same scRNA-seq dataset. If
combining scRNA-seq datasets to create one matrix, it is important to scale between datasets.) More
information is available at the CosMx Cell Profiles Scratch Space article.
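Before uploading a custom .csv reference matrix, it can help to check it against the requirements stated above (genes in rows, cell types in columns, untransformed values starting from 0) and to see how much of the panel it covers. The sketch below is a hypothetical pre-upload check, not part of the software. It assumes the first row holds cell type names and the first column holds gene names; use a provided reference matrix as the authoritative template.

```python
import csv
import io


def load_reference_matrix(csv_text):
    """Parse a genes-by-cell-types matrix; raise ValueError on malformed rows.

    Assumed layout: header row = corner cell followed by cell type names;
    each later row = gene name followed by one value per cell type.
    """
    rows = list(csv.reader(io.StringIO(csv_text)))
    cell_types = rows[0][1:]
    profiles = {}
    for row in rows[1:]:
        gene, values = row[0], [float(v) for v in row[1:]]
        if len(values) != len(cell_types):
            raise ValueError(f"{gene}: expected {len(cell_types)} values")
        if any(v < 0 for v in values):
            raise ValueError(f"{gene}: values must be linear scale, starting from 0")
        profiles[gene] = values
    return cell_types, profiles


def panel_coverage(profiles, panel_genes):
    """Fraction of panel genes that have a row in the reference matrix."""
    return sum(1 for g in panel_genes if g in profiles) / len(panel_genes)
```

A low `panel_coverage` value suggests the matrix does not include most of the genes from the panel, which the manual lists as a property of a quality reference matrix.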
[Page 58]
l Review the Advanced Parameters tab. By leaving these parameters set to FALSE, the software will use the
reference profile as-is. By setting any of these parameters to TRUE, the software will make adjustments to the
reference profile according to your selections (more detail and a decision flowchart are available at InSituType
FAQs hosted on GitHub).
  l Rescale (True/False): Anchor cells (high-confidence cell type calls) are identified, then used to
  estimate gene-by-gene platform effects. These effects are then used to rescale the reference
  profiles to the space of the CosMx data. This is a more cautious method of updating your reference
  profiles.
  l Refit (True/False): Anchor cells are identified for each cell type, then used to re-estimate the cell
  type's expression profile, wholly replacing the original profile. This is a more aggressive method for
  updating your reference profiles: it can more accurately calibrate the reference profiles, but it is
  also more likely to fail. Refitting tends to perform better when rescaling is also selected.
  l Refine Anchors (True/False): If InSituType fails to discover enough anchor cells to perform the
  calibration, consider setting this parameter to TRUE and reducing the Min Anchor Log Likelihood
  Ratio (default 0.03, range 0.001-0.1) and Min Anchor Cosine (default 0.1, range 0-0.5), to lower the
  threshold for anchor cell selection. The defaults were optimized for 1K-plex data so this adjustment
  is often needed for higher-plex studies.
Output visualizations: Heatmap of marker genes, flightpath plot, XY plot and UMAP (colored by the results of the
Cell Typing module). To access the flightpath plot, click the image icon on the successfully-executed Cell Typing
module in the Pipeline Structure panel. See CosMx SMI Data Visualizations on page 70.
Explore your Cell Typing dataset: Once the cell type clusters are projected, evaluate each cluster's spatial
distribution, expression profile, and immunofluorescence values. To do so, include the Pipeline Structure panel,
Pipeline Data panel, and Image Viewer panel in your Data Analysis Suite view, then select the Cell Typing module in
the Pipeline Structure panel. You may also overlay cell types on the tissue in the Image Viewer panel: select the
flow cell and FOV(s) to display from the dropdown menus; select Cells to overlay cell data and Step: Cell Typing
(InSituType). Scrutinize the cell typing results by asking:
l Should any cell types be merged, sub-clustered or deleted? Look at the output UMAP plot in the Pipeline Data
panel (color by: cell type). Cell types that occupy disparate clusters on the UMAP are candidates for splitting into
sub-clusters. Look at the flightpath plot (download image from module in the Pipeline Structure panel) for clusters
with lots of cells spread between centroids, indicating that the cell types are frequently confused with each
other, suggesting it may be reasonable to merge them. Clusters with poor confidence values (<90%) that are
confused with diverse other clusters, or that have very low average counts, may reasonably be deleted.
l Are cell types exhibiting the expected immunofluorescence results (e.g., do CD45 counts align with CD45+
immunofluorescence)? Can you identify known cell types based on tissue morphology?
l Supervised cell typing: are cell types correctly named? Do they have the expected spatial distribution (based on
the XY plot)?
l Unsupervised clustering: what cell types do these clusters correspond to? Use their spatial distribution (in the XY
plot) to assign cell types to the clusters.
Based on your observations, use the module Cell Type QC - RNA on page 60 to rename, merge, delete, or
subcluster cell types.
[Page 59]
Expression Model (CELESTA) - Protein
Prerequisite modules: Normalization
Module description: The Protein Cell Typing (CELESTA) algorithm performs cell typing by taking into account each
cell's marker expression profile and, if necessary, spatial information. Cell typing calls are guided by a signature
matrix that specifies the marker(s) known to have high or low expression for each cell type. A bimodal Gaussian
mixture model is then fit to estimate the probability of each cell having high expression for each considered marker.
When the probability is sufficiently high, a cell is considered an "anchor cell". When the probability is not sufficiently
high to make a high-certainty cell type call, the algorithm also considers spatial information by taking into account
the cell type calls of neighboring cells. These are considered "index cells". The probability thresholds can be tuned by
changing the tuning parameter input file to increase (or decrease) the number of a given cell type by decreasing (or
increasing) the high_expression_threshold_anchor or high_expression_threshold_index for anchor and index cells,
respectively (see Appendix II: Create a Signature Matrix and Tuning Parameter File for Cell Typing (CELESTA) on
page 87). This module runs the first step of the CELESTA algorithm, fitting the bimodal Gaussian mixture
model.
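To illustrate what fitting a bimodal Gaussian mixture model means for a single marker, the sketch below fits two one-dimensional Gaussian components by expectation-maximization and returns the probability that a given value belongs to the high-expression component. It is a generic textbook EM sketch with hypothetical function names and a simple min/max initialization; it is not the CELESTA implementation.

```python
import math


def _normal_pdf(x, mean, var):
    return math.exp(-(x - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)


def probability_high(x, params):
    """Posterior probability that x comes from the high-expression component."""
    p_low = params["weights"][0] * _normal_pdf(x, params["means"][0], params["variances"][0])
    p_high = params["weights"][1] * _normal_pdf(x, params["means"][1], params["variances"][1])
    denom = p_low + p_high
    return p_high / denom if denom > 0 else 0.5


def fit_bimodal_gmm(values, n_iter=100, min_var=1e-6):
    """Fit a two-component 1-D Gaussian mixture with EM.

    Component 0 is initialized at the minimum value and component 1 at the maximum.
    """
    n = len(values)
    overall_mean = sum(values) / n
    overall_var = max(sum((v - overall_mean) ** 2 for v in values) / n, min_var)
    params = {"means": [min(values), max(values)],
              "variances": [overall_var, overall_var],
              "weights": [0.5, 0.5]}
    for _ in range(n_iter):
        # E-step: responsibility of the high component for each value.
        resp_high = [probability_high(v, params) for v in values]
        resp_low = [1.0 - r for r in resp_high]
        # M-step: update each component from its weighted values.
        for k, resp in enumerate((resp_low, resp_high)):
            n_k = sum(resp)
            if n_k <= 0:
                continue
            mean_k = sum(r * v for r, v in zip(resp, values)) / n_k
            var_k = sum(r * (v - mean_k) ** 2 for r, v in zip(resp, values)) / n_k
            params["means"][k] = mean_k
            params["variances"][k] = max(var_k, min_var)
            params["weights"][k] = n_k / n
    # Keep component 1 as the higher-mean ("high expression") component.
    if params["means"][0] > params["means"][1]:
        for key in params:
            params[key].reverse()
    return params
```

After fitting on a marker's values across cells, `probability_high(value, params)` gives the per-cell probability that the tuning thresholds (for example high_expression_threshold_anchor) would be compared against.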
Custom pipeline module name (optional): Used to identify module results in downstream analysis; see full
description on page 47.
Input parameters:
l Separately model by flow cell (checkbox; should separate models be fitted to each flow cell?).
l Signature matrix: the signature matrix defines which protein(s) will be used as markers for which cell types. The
software defaults to the signature matrix matching the data in the study (human or mouse). If needed, download
the default mouse signature matrix from Github/Nanostring-Biostats/CelestaSignatureLibrary, or create a custom
signature matrix for human or mouse data and upload it to the module (see instructions in Appendix II: Create a
Signature Matrix and Tuning Parameter File for Cell Typing (CELESTA) on page 87). Max upload file size 100 MB.
Output visualizations: XY plot (cells colored by probability for each protein from 0-1) and histogram (distribution of
probability for each protein).
Cell Typing (CELESTA) - Protein
Prerequisite modules: Expression Model (CELESTA)
Module description: The Protein Cell Typing (CELESTA) algorithm performs cell typing by taking into account each
cell's marker expression profile and, if necessary, spatial information. Cell typing calls are guided by a signature
matrix that specifies the marker(s) known to have high/low expression for each cell type. A bimodal Gaussian
mixture model is then fit to estimate the probability of each cell having high expression for each considered marker.
When the probability is sufficiently high, a cell is considered an "anchor cell". When the probability is not sufficiently
high to make a high-certainty cell type call, the algorithm also considers spatial information by taking into account
the cell type calls of neighboring cells. These are considered "index cells". The probability thresholds can be tuned by
changing the tuning parameter input file to increase (or decrease) the number of a given cell type by decreasing (or
increasing) the high_expression_threshold_anchor or high_expression_threshold_index for anchor and index cells,
respectively (see Appendix II: Create a Signature Matrix and Tuning Parameter File for Cell Typing (CELESTA) on
page 87). This module runs the second step of the CELESTA algorithm, assigning cells to cell types using a
signature matrix and tuning parameters.
[Page 60]
Custom pipeline module name (optional): Used to identify module results in downstream analysis; see full
description on page 47.
Input parameters:
l Signature matrix: the signature matrix defines which protein(s) will be used as markers for which cell types. The
software defaults to the human signature matrix. For a mouse study, download the mouse signature matrix from
Github/Nanostring-Biostats/CelestaSignatureLibrary and upload it to the module. Alternatively, create and upload
a custom signature matrix for human or mouse (see instructions in Appendix II: Create a Signature Matrix and
Tuning Parameter File for Cell Typing (CELESTA) on page 87). Max upload file size 100 MB.
l Tuning parameter: the thresholds set in the tuning parameter file influence how the CELESTA algorithm calls cell
types based on the signature matrix. It defines the high- and low-thresholds for anchor and index cells. Use the
default tuning parameter file (already loaded in the software) or upload a custom tuning parameter .csv file (see
instructions in Appendix II: Create a Signature Matrix and Tuning Parameter File for Cell Typing (CELESTA) on
page 87). Max upload file size 100 MB.
l Maximum neighborhood radius (µm) (0-100; default 30).
l Maximum number of cells in neighborhood.
l Spatial weight (0-10; default 5) (β value in Potts model to determine how strongly spatial information is weighted
in cell typing. 0 indicates spatial information is ignored).
l Fast approximation (yes/no) (use a fast and close approximation that splits cells into smaller spatially clustered
groups of ~10,000 cells per group at a time).
Output visualizations: Additional columns are added to the study metadata showing the results from each round
of CELESTA cell typing (the final cell typing designations are shown in the column final_cell_type); XY plot and
UMAP colored by CELESTA cell type label.
Explore your Cell Typing dataset: Please see the prompts under Explore your Cell Typing dataset on page 58.
Cell Type QC - RNA
Prerequisite modules: Cell Typing (InSituType)
Module description: Refines cluster results from InSituType algorithm by renaming, merging, deleting, and/or
subclustering. The results originally generated by the InSituType algorithm in the Cell Typing - RNA module will be
updated with the output of the Cell Typing QC module.
Custom pipeline module name (optional): Used to identify module results in downstream analysis; see full
description on page 47.
Input parameters:
• Merge From/To: select a cluster to merge with another (From), and a cluster into which to merge (To). Can also
be used to rename a cluster (e.g., "a" to "Tumor cell").
• Delete: select clusters to delete. Cells will be re-classified using the best fit from remaining clusters.
• Subcluster: select a cluster to be split into multiple new subclusters.
• n: Specify the number of clusters (n) for the subclustering step, if selected above.
[Page 61]
Output visualizations: Visualizations such as UMAP which can be colored by the results of the Cell Typing - RNA
module will be updated to reflect the output of the Cell Typing QC module.
When you run the Cell Type QC module, you are altering the Cell Typing (InSituType) results by merging,
subclustering, or deleting clusters. The original Cell Typing (InSituType) output will be overwritten and will no
longer be available for visualization and downstream analysis. In addition, the previously specified module
parameters will not be available for rechecking. Please record the Cell Typing (InSituType) parameters used, if
needed.
Neighbor Network: Expression Space - RNA or Protein
Prerequisite modules: PCA
Module description: Constructs the KNN (k-Nearest Neighbor) graph based on the Euclidean distance in PCA
space, then constructs the SNN (Shared Nearest Neighbor) graph with edge weights between any two cells based
on the shared overlap in their local neighborhoods (Jaccard index) and pruning of distant edges.
Custom pipeline module name (optional): Used to identify module results in downstream analysis; see full
description on page 47.
Input parameters:
• Jaccard cutoff: sets the stringency of pruning the dataset, from a value 0 (no pruning) to 1 (total pruning). This
value is the threshold or cutoff for acceptable Jaccard index when computing the neighborhood overlap for the
SNN construction. Any edges with values less than or equal to this value will be set to 0 and removed from the
SNN graph. Default: 0.06, valid range: 0-1.
• Distance metric: Euclidean (default), Cosine, Manhattan, or Hamming.
Read more about distance metrics in machine learning in Ehsani and Drabløs (2020).
Output visualizations: None (the output dataset is used to run Leiden Clustering but is not visualized itself).
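The KNN-then-SNN construction described for this module can be sketched in a few lines. This is a brute-force illustration, not the module's code: the function names are hypothetical, and including each cell in its own neighborhood before computing the overlap is an assumption borrowed from common SNN implementations.

```python
import math

def knn_indices(points, k):
    """For each point, the set of its k nearest neighbors (Euclidean), excluding itself."""
    neighbors = []
    for i, p in enumerate(points):
        order = sorted((j for j in range(len(points)) if j != i),
                       key=lambda j: math.dist(p, points[j]))
        neighbors.append(set(order[:k]))
    return neighbors

def snn_graph(points, k, jaccard_cutoff=0.06):
    """Return {(i, j): weight} for i < j, where weight is the Jaccard index of the
    two cells' neighborhoods. Edges with weight <= jaccard_cutoff are pruned."""
    nbrs = knn_indices(points, k)
    # Assumption: a cell counts as a member of its own neighborhood
    hoods = [nbrs[i] | {i} for i in range(len(points))]
    edges = {}
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            shared = len(hoods[i] & hoods[j])
            union = len(hoods[i] | hoods[j])
            weight = shared / union
            if weight > jaccard_cutoff:
                edges[(i, j)] = weight
    return edges
```

Here `points` stands for cell coordinates in PCA space. Raising `jaccard_cutoff` removes more weak edges, so the graph passed to Leiden Clustering becomes sparser.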
Leiden Clustering - RNA or Protein
Prerequisite modules: Neighbor network: expression space
Module description: Leiden clustering is an unsupervised clustering method that is used to identify groups of cells
which are related based on how similar they are in a graph structure. Clusters are defined by moving cells to identify
groups of cells that can be aggregated without changing the overall relationship of the graph and looking for
unstable nodes which serve as bridges between related communities to help define the boundaries of different
clusters. The resolution that you select will determine the overall number of clusters identified after running the
algorithm, with lower numbers identifying fewer clusters, and higher numbers identifying more.
Custom pipeline module name (optional): Used to identify module results in downstream analysis; see full
description on page 47.
[Page 62]
Input parameters: Resolution (controls the overall number of clusters identified after running the algorithm,
where a low number yields fewer, larger groups and a high number yields more, smaller groups; default 1,
range 0.2-3).
Output visualizations: Leiden cluster annotation is generated and included in study metadata. If UMAP
has been run, the default visualization of the UMAP plot will be to color by Leiden clusters (Figure 43).
Leiden clusters can also be used to color an XY plot. See CosMx SMI Data Visualizations on page 70.
Figure 43: UMAP with color coding by Leiden clustering
Explore your Cell Typing dataset:
• Evaluate each cluster's spatial distribution, expression profile, and immunofluorescence values.
To do so, include the Pipeline Structure panel, Pipeline Data panel, and Image Viewer panel in your Data
Analysis Suite view, then select the Leiden Clustering module in the Pipeline Structure panel. You may also
overlay cell types on the tissue in the Image Viewer panel: select the flow cell and FOV(s) to display from the
dropdown menus; select Cells to overlay cell data and Step: Leiden Clustering.
• Are cell types exhibiting the expected immunofluorescence results (e.g. do CD45 counts align with CD45+
immunofluorescence)? Can you identify known cell types based on tissue morphology?
• What cell types do these clusters correspond to? Use their spatial distribution (in the XY plot) to assign cell types
to the clusters.
• Evaluate any previously-generated UMAP with the new option to Color by: Leiden Clustering.
Identify Marker Genes - RNA or Protein
Prerequisite modules: Cell Typing, Leiden Clustering, or Neighborhood Analysis
Module description: This module identifies markers associated with each cell type or cluster previously identified
in the dataset. It looks for genes that are expressed above background consistently, but also most specifically
restricted to each cell type or cluster within the dataset. The module acts on each gene independently.
Custom pipeline module name (optional): Used to identify module results in downstream analysis; see full
description on page 47.
Input parameters: Gene List Name (if the Gene Selection module has been run, the resulting gene list is available
to select for input to the Identify Marker Genes module). If 'All genes' is selected, this module will run on all genes
even if the upstream Normalization module was run on a Gene Selection subset.
Output visualizations: Results matrix which consists of genes x cell types (one value for each cell type/gene pair;
values are average estimated value of gene within all cells matching the ID); heatmap of marker genes vs. cell types
scaled across cell types such that the heatmap value is the z-score of expression across all cell types for a given
gene. See CosMx SMI Data Visualizations on page 70.
Explore your identified marker genes: Do well-characterized cell type markers appear to be expressed only in
their canonical cell types?
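The heatmap scaling described for this module (a z-score per gene across cell types) can be reproduced on an exported results matrix along these lines. This is a sketch under assumptions: the nested-dictionary layout and the function name are hypothetical, and the population standard deviation is used here because the manual does not say which estimator the software applies.

```python
import statistics

def zscore_across_cell_types(matrix):
    """matrix: {gene: {cell_type: average_expression}}.
    Returns the same shape with each gene's values z-scored across cell types."""
    scaled = {}
    for gene, by_type in matrix.items():
        values = list(by_type.values())
        mean = statistics.mean(values)
        sd = statistics.pstdev(values)  # assumption: population SD
        # A gene with identical values in every cell type gets 0.0 everywhere
        scaled[gene] = {ct: ((v - mean) / sd if sd > 0 else 0.0)
                        for ct, v in by_type.items()}
    return scaled
```

After scaling, a well-behaved marker shows one strongly positive value in its canonical cell type and values near or below zero elsewhere.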
[Page 63]
Neighborhood Analysis - RNA or Protein
Prerequisite modules: Cell Typing or Leiden Clustering
Module description: This module identifies distinct cellular neighborhood clusters (niches) based on cell type
composition and XY coordinates. This module helps define the structural composition of a tissue by looking for
regional differences in cell type composition. Niches can be repeated structures that are frequently found within a
tissue but which are not contiguous (e.g. glomeruli in the kidney, germinal centers in the lymph node) or which are
physically connected across a tissue (e.g. epithelial layer in the colon).
Custom pipeline module name (optional): Used to identify module results in downstream analysis; see full
description on page 47.
Input parameters:
• Method (either Radius (µm) to capture nearest neighbors in space (default: 50 µm, range: 10-500 µm) or
Neighboring Cells to indicate the number of nearest neighbors to evaluate (default: 250, range: 10-500)).
• Number of neighborhoods (clusters) desired: default: 10, range: integers ≥ 3.
Output visualizations: There is not a specific visualization for this module, but other visualizations like XY plot or
UMAP can be colored by the Neighborhood Analysis results, and Neighborhood Analysis module data can be
overlaid on the tissue in the Image Viewer panel by selecting Cells, Cell Type, and Step: Neighborhood Analysis.
See CosMx SMI Data Visualizations on page 70.
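The "cell type composition" that drives niche assignment can be pictured as one composition vector per cell. The sketch below covers only the Radius method and only that first step; the function name and input layout are hypothetical, and the subsequent clustering of these vectors into the requested number of neighborhoods is not shown because the manual does not specify the algorithm.

```python
import math
from collections import Counter

def neighborhood_composition(cells, radius_um=50.0):
    """cells: list of (x_um, y_um, cell_type) tuples.
    Returns, for each cell, the fraction of each cell type among the other
    cells located within radius_um of it."""
    cell_types = sorted({c[2] for c in cells})
    profiles = []
    for i, (xi, yi, _) in enumerate(cells):
        counts = Counter(
            t for j, (xj, yj, t) in enumerate(cells)
            if j != i and math.hypot(xi - xj, yi - yj) <= radius_um
        )
        total = sum(counts.values())
        # An isolated cell (no neighbors in range) gets an all-zero profile
        profiles.append({t: (counts[t] / total if total else 0.0) for t in cell_types})
    return profiles
```

Cells whose vectors look alike (for example, mostly epithelial neighbors) would end up in the same niche even if they sit in different parts of the tissue, which matches the non-contiguous niches described above.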
Ligand-Receptor (LR) Analysis - RNA
Prerequisite modules: Leiden Clustering or Cell Typing (InSituType)
Module description: Scores pairs of cells and individual cells for ligand-receptor signaling. Ligand-receptor target
expression in adjacent cells is used to calculate a co-expression score. A test is then performed to determine if the
overall average of the scores for each ligand-receptor pair is enriched by the spatial arrangement of cells. Specific
cell types can be defined for the analysis. Note that a pipeline that includes LR Analysis will pause prior to
this module to allow the user to designate the Leiden Clustering or Cell Typing data as the input to LR
Analysis.
Custom pipeline module name (optional): Used to identify module results in downstream analysis; see full
description on page 47.
Input parameters: Ligand expressing cell type(s); Receptor expressing cell type(s); Receptor expressing cell type
permutations (default 100; must be an integer > 0); Calculation method (directional or non-directional). (Directional
counts L1:R1 as distinct from R1:L1 (two pairs) whereas non-directional counts those as one pair).
Output visualizations: Heatmap with each LR pair on the y-axis and flow cell names on the x-axis. Significant
enrichment scores are colored distinctly from insignificant enrichment scores. See CosMx SMI Data Visualizations
on page 70. (A results matrix of average LR score and significance of spatial enrichment for all LR pairs in the
selected cell types is included in the tiledb object, available with data export, but is not available in the software user
interface).
[Page 64]
Spatial Network - RNA or Protein
Prerequisite modules: Quality Control
Module description: Creates a network or graph structure of the physical distribution of cells. Cells are converted
to nodes in the graph, and connections between cells (e.g. nearest neighbors) are represented as edges. The
network can be built in one of three ways: radius-based (all cells connected within a given radius), nearest
neighbors, or Delaunay triangulation.
Custom pipeline module name (optional): Used to identify module results in downstream analysis; see full
description on page 47.
Input parameters: Method of building the network (distance (default), nearest neighbors, or Delaunay):
• If "distance" method, select radius (µm): radius to select cells to create edges (default: 20 µm,
range: 10 µm-100 µm).
• If "nearest" method, input number of nearest neighbors (cells) to evaluate (default: 5, range: 1-50,
integers only).
Output visualizations: An adjacency matrix with dimensions number of cells x number of cells. Each edge is
recorded in the matrix as the Euclidean distance between the cells. No visualizations are available, but this module's
output can serve as the input to other modules.
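For the radius-based ("distance") method, the resulting adjacency matrix can be pictured with the small sketch below. It is illustrative only: the function name is hypothetical, a dense list-of-lists is used for readability (a real dataset would need a sparse structure), and storing 0.0 for "no edge" is an assumption.

```python
import math

def radius_network(coords, radius_um=20.0):
    """coords: list of (x_um, y_um) cell centroids.
    Returns an n x n matrix where entry [i][j] holds the Euclidean distance
    between cells i and j if they lie within radius_um, else 0.0 (no edge)."""
    n = len(coords)
    adjacency = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = math.dist(coords[i], coords[j])
            if d <= radius_um:
                adjacency[i][j] = adjacency[j][i] = d  # undirected edge
    return adjacency
```

For example, `radius_network([(0, 0), (3, 4), (100, 100)])` connects only the first two cells, which are 5 µm apart.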
Cell Type Co-Localization - RNA or Protein
Prerequisite modules: Cell Typing or Leiden Clustering
Module description: Examines the tendency of different cell types to be located near each other. Each pair of cell
types defined from supervised or unsupervised clustering is tested using Ripley's K-function (a function of the
distance between the different cell types) for whether the cells' spatial distribution differs from a theoretical
Poisson point process where a cell's location is not dependent on another cell's location. The results are
summarized in a heatmap indicating which cell types tend to cluster together or isolate from each other. In addition,
a more granular view is shown when plotting the pair correlation function for a given cell type pairing as a function of
the radius, which can reveal specific distances at which the cells of each type are co-localized.
Custom pipeline module name (optional): Used to identify module results in downstream analysis; see full
description on page 47.
Input parameters: Radius (radius within which to evaluate neighbor cell types; 0-300 µm; default 100 µm); Stratify
results across Flow cells or FOVs.
Output visualizations: Heatmap showing net difference across input radii between theoretical and observed
K-function value; positive in red indicating clustering, negative in blue indicating separation.
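The comparison between observed and theoretical K-function values can be sketched as follows. This is a naive cross-type estimator with no edge correction, written for illustration: the function names are hypothetical, and the module's actual estimator and edge handling are not described in this manual. Under a Poisson point process in two dimensions the theoretical value at radius r is πr².

```python
import math

def cross_k(points_a, points_b, radius, area):
    """Naive cross-type Ripley's K at one radius (no edge correction)."""
    close_pairs = sum(1 for p in points_a for q in points_b
                      if math.dist(p, q) <= radius)
    return area * close_pairs / (len(points_a) * len(points_b))

def k_excess(points_a, points_b, radius, area):
    """Observed K minus the Poisson expectation pi * r^2.
    Positive suggests co-localization, negative suggests separation."""
    return cross_k(points_a, points_b, radius, area) - math.pi * radius ** 2
```

Summing or averaging `k_excess` over a series of radii gives one number per cell type pair, which is the kind of quantity the heatmap colors red (positive) or blue (negative).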
[Page 65]
Pathway Analysis - RNA
Prerequisite modules: Normalization
Module description: Signaling pathway analysis is calculated on a per-cell basis using gene sets of pre-defined
pathways. The R package AUCell (Aibar et al. 2017) is used to calculate the relative expression of different gene
sets within a cell, and determine whether a gene set corresponding to a particular pathway is enriched. Gene sets
which do not have sufficient coverage (20% of genes in gene set present in dataset) are excluded from analysis.
Custom pipeline module name (optional): Used to identify module results in downstream analysis; see full
description on page 47.
Input parameters: Upload a gene set file in format .gmt using gene symbols as the input. Learn more about .gmt
file format in the Gene Set Enrichment Analysis article linked here.
Output visualizations: A cells-by-gene set matrix with estimated gene set score for each cell is added to the study
metadata and accessible through data export. Visualizations like XY plot or UMAP can be colored by the Pathway
Analysis results, and Pathway Analysis module data can be overlaid on the tissue in the Image Viewer panel by
selecting Cells, Cell Type, and Step: Pathway Analysis. See CosMx SMI Data Visualizations on page 70.
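Before uploading a .gmt file, it can be useful to check which gene sets would survive the 20% coverage rule against your panel. The sketch below assumes the standard tab-delimited .gmt layout (set name, description, then gene symbols); the function names are hypothetical, and treating exactly 20% as sufficient is an assumption, since the manual does not say whether the boundary is inclusive.

```python
def parse_gmt(text):
    """Parse .gmt content: one gene set per line as name<TAB>description<TAB>gene1<TAB>gene2..."""
    gene_sets = {}
    for line in text.splitlines():
        fields = line.split("\t")
        genes = {g.strip() for g in fields[2:] if g.strip()}
        if len(fields) >= 3 and genes:
            gene_sets[fields[0]] = genes
    return gene_sets

def filter_by_coverage(gene_sets, dataset_genes, min_coverage=0.20):
    """Keep gene sets whose fraction of genes present in the dataset is at least
    min_coverage; return only the genes that are actually measured."""
    measured = set(dataset_genes)
    kept = {}
    for name, genes in gene_sets.items():
        present = genes & measured
        if len(present) / len(genes) >= min_coverage:  # assumption: inclusive cutoff
            kept[name] = present
    return kept
```

Gene symbols must match the panel's symbols exactly for the overlap to be counted, so a quick check like this also catches naming mismatches.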
Spatial Expression Analysis - RNA
Prerequisite modules: Cell Typing (InSituType)
Module description: Identify genes with spatially dependent expression patterns. This module identifies genes
which have a spatial distribution that is non-uniform throughout a tissue, and which may be associated with specific
tissue structures, microenvironment niches, or cell types. The module also measures associated spatial expression
between genes which can be used to group genes into different spatial expression patterns. The two statistics
calculated related to spatial expression patterns are Moran's I and Lee's L.
This module does not assume any specific relationship between structures in the tissue. Genes with significant
count values should be visualized to determine how they are related to the tissue morphology.
If the input dataset exceeds 2000 genes, the module runs on the most variable 2000 genes as calculated by the
module.
Custom pipeline module name (optional): Used to identify module results in downstream analysis; see full
description on page 47.
Input parameters:
• Neighborhood size (number of cells in spatial network: default 10; must be an integer > 0).
• Receptor Expressing Cell Type (select cell type(s) of interest for the spatial expression analysis. Cell types will
become available after successful upstream clustering, and that clustering will dictate the cell types available in
this menu. If the box Cell types required is checked, then a selection must be made in the cell types dropdown
menu. If it is unchecked, the pipeline continues without requiring cell type input).
• Gene List Name: If the Gene Selection module has been run, the resulting gene list is available to select for input
to the Spatial Expression Analysis module. The gene list will be used even if it exceeds 2000 genes. If 'All genes'
is selected, this module will run on the most variable 2000 genes as calculated by the module.
Outputs and visualizations:
[Page 66]
• A results table with Moran's I values for each gene and results from the Monte-Carlo test for significance of each
I value.
• Lee's L association matrix with gene by gene measures of spatial association.
These outputs are not currently accessible from AtoMx, but can be accessed from an exported tileDB array. For
more information about the output of Spatial Expression Analysis, refer to the CosMx SMI Liver Public Data
Release subsection at
https://nanostring.com/wp-content/uploads/2023/01/LiverPublicDataRelease.html#311_Spatial_Expression_Analysis.
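To interpret the exported Moran's I values, it may help to see how the statistic behaves on a toy example. The sketch below uses the textbook definition with binary k-nearest-neighbor weights; the function name and the weighting scheme are assumptions, and the module's own weights and Monte-Carlo test are not reproduced here.

```python
import math

def morans_i(coords, values, k=10):
    """Moran's I for one gene, with weight 1 for each of a cell's k nearest
    neighbors and 0 otherwise. Assumes values are not all identical."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    denominator = sum(d * d for d in dev)
    numerator = 0.0
    weight_total = 0.0
    for i in range(n):
        order = sorted((j for j in range(n) if j != i),
                       key=lambda j: math.dist(coords[i], coords[j]))
        for j in order[:k]:
            numerator += dev[i] * dev[j]
            weight_total += 1.0
    return (n / weight_total) * numerator / denominator
```

Values near +1 indicate that neighboring cells have similar expression (spatially clustered genes), values near 0 indicate no spatial structure, and negative values indicate that neighbors tend to differ.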
Differential Expression (DE) - RNA
Prerequisite modules: Leiden Clustering or Cell Typing
Module description: This module performs Differential Expression analysis using generalized linear (mixed)
models for single cell expression. This module allows the user to control for the expression of neighboring cells by
including 'neighboring cell expression' of the analyzed gene as a fixed-effect control variable in the DE model.
Controlling for expression in neighboring cells is motivated by the observation that cells in close proximity on a
tissue are not independent, and comparisons of DE between groups may be affected by cell-segmentation and
cell-type uncertainty. Often, single-cell DE analyses may test whether genes are differentially expressed within a
specific cell-type. In practice, imperfect cell segmentation can result in overlap or bleed-over of transcripts from
neighboring cells, which can also increase uncertainty in downstream cell-typing analyses. For these reasons,
including the expression of the analyzed gene in neighboring cells can be a useful control variable. Two approaches
are implemented in the DE module to handle this issue, configurable in Advanced Parameters:
1. An overlapping cells metric: Used to identify genes which may be expressed within specific cell types due
primarily to overlapping cells or segmentation errors, and exclude them from cell-type specific DE analysis. The
metric computes the average expression in the selected cell type, and the average expression in spatial
neighbors of the selected cell type (only considering "other" cell types). The ratio of these two average expression
vectors is a quick and useful way to discard implausible genes.
2. Covariate adjustment: Compute the total expression of the gene of interest in the spatial neighbors of the
selected cell type (only considering "other" cell types), and include this as a control variable in the regression
model.
For example, if analyzing a particular cell type like T cells, and some gene is highly expressed in neighboring cells of
"other" cell types (non-T cells), then it may be prudent to exclude that gene from analysis because it may be prone to
false expression from overlapping or imprecise segmentation. This module allows the researcher to configure
settings to adjust DE analysis accordingly.
It is recommended to analyze the rate of gene expression per cell library size (total transcript counts across all
genes) using either:
1. Raw counts, using negative binomial distribution with library size as an offset (this is the default
option in the DE module in AtoMx SIP), or
2. Normalized cell expression (where normalization method takes into account the cell library size),
using a linear (mixed) model / Gaussian distribution. More information on normalization is in the
CosMx Analysis Scratch Space post on QC and Normalization of RNA Data.
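The phrase "library size as an offset" can be made concrete with the generic form of a negative binomial regression. The notation below is an assumption introduced for illustration and is not taken from the software: y_cg is the raw count of gene g in cell c, s_c is the library size of cell c, x_c holds the model covariates (for example the group of interest and otherct_expr), and φ_g is a gene-specific dispersion.

```latex
y_{cg} \sim \mathrm{NB}(\mu_{cg}, \phi_g), \qquad
\log \mu_{cg} = \log s_c + \mathbf{x}_c^{\top} \boldsymbol{\beta}_g
\quad\Longleftrightarrow\quad
\log \frac{\mu_{cg}}{s_c} = \mathbf{x}_c^{\top} \boldsymbol{\beta}_g
```

Because log s_c enters with a fixed coefficient of 1, the coefficients β_g describe changes in the expression rate per unit of library size rather than changes in raw counts, which is what the first option above recommends.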
[Page 67]
Custom pipeline module name (optional): Used to identify module results in downstream analysis; see full
description on page 47.
Input parameters:
Basic:
• Flow cell or biological sample annotation. Default selection: Run_Tissue_name (when determining the neighbors
of a cell in physical space, this parameter ensures that cells with similar spatial coordinates are not considered
'neighbors' unless they come from the same flow cell/sample (or other trait as selected from the dropdown
menu)).
• Filter Cells by metadata: In v2.1, this module now supports multiple metadata-based cell filters. Some examples
are listed here:
  ◦ To run DE on a particular InSituType cell type, click Add Filter, then select RNA_Cell_Typing_InSituType
  from the Include cells dropdown. The Analyze cells dropdown populates with the cell types. Select the
  desired cell types to analyze.
  ◦ To filter based on QC flags related to cell counts, click Add Filter, then select qcFlagsCellCounts from the
  Include cells dropdown. From the Analyze cells dropdown, select the inclusion criteria, such as Pass or Fail.
  ◦ If the Sample Metadata .csv file has been edited using the GetSampleMetadata and UpdateSampleMetadata
  custom modules to include a new column of annotations, that column name may be selected here to filter on
  that criterion. For example, if a new column was added such as "tissue layer", with each row of the column
  containing "crypt", "submucosa", or blank, select "tissue layer" in this dropdown menu to filter based on this
  parameter. In the Analyze cells dropdown, select the subgroup you wish to include in the DE analysis.
• Genes/Proteins (multi-select; leave blank to analyze all genes. If selecting individual genes/proteins to analyze,
the software limits the selection to 200 items).
• Distribution family (select nbinom2, gaussian, or poisson - the parametric family to be used for the regression
model).
• Variable to use for Volcano Plot. Select one variable of interest. Summary outputs will be generated in .csv file
format for all models in DE variable, which can be downloaded and used to create additional figures and plots as
needed.
• Edit the Model Formula, if necessary. The term "otherct_expr" ("other cell type expression") is a default covariate,
which is calculated as the expression of the analyzed gene in neighboring cells of other annotation types. See the
Advanced tab for more options related to this covariate. Do not select cell_ID or CellID in the Model Formula or an
error may occur.
Advanced:
• Neighbor expression category (select non-numerical variable of interest). For "otherct_expr" default covariate;
metadata annotation (usually cell type) used to compute the expression of each gene in neighboring cells of other
annotation groups. For an example cell with annotation 'a', compute the neighbor expression of each gene in
neighbor cells which are not of annotation 'a'.
• Neighbor bandwidth (default 50 microns; range 1-100). For "otherct_expr" default covariate; upper limit for
distance at which to consider a neighboring cell a "neighbor".
[Page 68]
• Max overlap ratio metric (default 1). For each annotation in "Neighbor Expression Category" (see above bullet),
compute the average expression of each gene in the neighbors of cells of that category (among neighbors which
are not the same category) and compute the average expression of each gene within the annotation.
Overlap ratio is then defined as the ratio:
(Avg Expression in Cells of Other Annotations) / (Avg Expression in Cells of Annotation)
Ratios > 1 indicate higher expression in cells of other categories than the indexed category, and hence DE genes
are more likely to be spuriously associated due to segmentation uncertainty. By default, 1 is the cutoff used for
including a gene in the analysis. Set to a large number to remove filtering.
• Neighbor expression weighted by distance (select None or Weight). For "otherct_expr" default covariate; Should
neighbor expression of the gene in neighboring cells be weighted by distance? "Weight" corresponds to yes,
None corresponds to equal weights regardless of distance.
• Method to aggregate neighbor expression (select Mean or Sum). For "otherct_expr" default covariate; Should
neighbor expression of the gene in neighboring cells of other types be summed (sum) or averaged (mean)?
• Normalize neighbor expression (select True [recommended] or False). For "otherct_expr" default covariate;
Should neighbor expression of the gene in neighboring cells of other types be normalized by the total counts?
(This is recommended).
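The Max overlap ratio metric in the Advanced list can be illustrated with a short sketch. The function name and input layout are hypothetical, and treating a ratio exactly equal to the cutoff as passing is an assumption; only the ratio definition itself comes from the parameter description.

```python
def overlap_ratio_filter(avg_in_annotation, avg_in_other_neighbors, max_ratio=1.0):
    """avg_in_annotation: {gene: average expression within the annotation of interest}.
    avg_in_other_neighbors: {gene: average expression in neighboring cells of
    other annotations}. Returns (genes_kept, ratios)."""
    ratios = {}
    genes_kept = []
    for gene, own in avg_in_annotation.items():
        other = avg_in_other_neighbors.get(gene, 0.0)
        # A gene absent from the annotation itself cannot be trusted; give it an infinite ratio
        ratio = other / own if own > 0 else float("inf")
        ratios[gene] = ratio
        if ratio <= max_ratio:  # assumption: cutoff is inclusive
            genes_kept.append(gene)
    return genes_kept, ratios
```

For T cells, a gene such as a hepatocyte marker that is far more abundant in neighboring non-T cells would get a ratio well above 1 and be dropped, while a genuine T cell gene would get a ratio below 1 and be kept.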
Output visualizations: Volcano plot (log2 fold changes (x) against -log10(p-values) (y)). To access the volcano plot,
click the image icon on the successfully-executed DE module in the Pipeline Structure panel (see CosMx SMI
Data Visualizations on page 70). Data tables summarizing the results are also saved to the tiledb dataset. To create a
heatmap from DE data, please refer to instructions in the CosMx SMI Human Liver FFPE Dataset Vignette.
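When re-plotting the downloaded DE summary tables, the two volcano plot axes can be computed as in the sketch below. It is purely illustrative: the function name is hypothetical, and the fold change is derived here from two group means for simplicity, whereas the module reports fold changes estimated by the fitted model.

```python
import math

def volcano_point(mean_group_a, mean_group_b, p_value, pseudocount=1e-9):
    """Return (log2 fold change, -log10 p-value) for one gene."""
    # The pseudocount guards against division by zero for unexpressed genes
    log2_fc = math.log2((mean_group_a + pseudocount) / (mean_group_b + pseudocount))
    # Clamp the p-value so that a reported 0 does not produce an infinite y value
    neg_log10_p = -math.log10(max(p_value, 1e-300))
    return log2_fc, neg_log10_p
```

Genes far to the left or right have large expression differences between groups, and genes high on the y-axis have small p-values, so the most interesting candidates sit in the upper corners of the plot.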
Novae - RNA (Spatial Discovery studies only)
Prerequisite modules: Integrated into the foundational analysis pipeline in a Spatial Discovery study; not available
to add to a custom pipeline in a classic configuration study.
Module description: New in AtoMx SIP v2.2, the Novae module integrates the pre-trained foundation models for
spatial domain discovery in spatial transcriptomics datasets via the Novae algorithm (see Appendix I: Literature
References on page 86). Unlike the existing Spatial Cluster (Neighborhood Analysis) module that is based on
neighborhood cell-type composition, the Novae algorithm learns spatial domains from expression and spatial
adjacency using a graph neural network with self-supervised contrastive learning and thus enables label-free
discovery of spatial domains without the need for prior cell-type annotation. The models were trained with millions
of cells across diverse tissue types and can identify spatial domains capturing tissue architecture and heterogeneity
even with zero-shot inference.
If the Novae module does not execute successfully, it can be re-tried by clicking the play button on the module in
the Pipeline Structure panel. When the module completes successfully, rerun the Spatial Discovery module to
obtain updated results. If the Novae module has not executed successfully, then the Spatial Discovery view reflects
results from the Neighborhood Analysis module for its coloring.
Processing time for this module is highly dependent on AtoMx SIP hub traffic, so it is difficult to estimate run time.
For reference, a WTX study with 1.6 million cells required 4.25 hours for the Novae module to complete.
Input parameters: The module runs on default values, so it is not configurable at this time.
[Page 69]
Output visualizations: Spatial Discovery image view with cell coloring by niche (see Spatial Discovery View: Data
Overlay on page 29). If the Novae module does not execute successfully, then the results of Neighborhood
Analysis will be used to color cells in the default Spatial Discovery view.
Spatial Discovery - RNA (Spatial Discovery studies only)
Prerequisite modules: Integrated into the foundational analysis pipeline in a Spatial Discovery study; not available
to add to a custom pipeline in a classic configuration study.
Module description: New in AtoMx SIP v2.2, the Spatial Discovery module leverages results from upstream
modules in the foundational data analysis pipeline to generate a downloadable Spatial Discovery data package. This
data package includes plain text files containing marker gene summaries and pathway summaries faceted by
Leiden cluster and spatial domain (or, in the event no Novae results are available, neighborhood niche assignments).
These files can be further analyzed in your statistical language of choice (e.g., R, python) as well as uploaded to
your favorite Large Language Model (LLM) for continued conversational exploration of your CosMx SMI results.
Input parameters: The module runs on default values, so it is not configurable at this time.
Output: Module creates the downloadable Spatial Discovery package, which can be used as input to Large
Language Models for further data exploration. The Download icon on the Spatial Discovery module in the
Pipeline Structure panel downloads the same package as the Spatial Discovery button in the Data Overlay panel
(described on page 29).
[Page 70]
CosMx SMI Data Visualizations
Study Statistics Table
Displays the Number of FOV, Mean transcripts per cell, Mean unique genes per cell, Number of non-empty cells,
10th percentile transcript per cell, 90th percentile transcript per cell, and Mean Negprobe counts per cell
(Figure 44; see also Table 4 on page 46). Click the arrow (caret) to expand the list of FOV in the selected flow cell.
Available for all studies based on Initial Data.
Figure 44: Study statistics table in Pipeline Data panel
QC Metrics Table
Displays the pass/fail metrics for the QC module run (Figure 45). RNA QC metrics are defined in Table 5 on
page 49 and Protein QC metrics are defined in Table 6 on page 50.
Figure 45: Pipeline Data panel - QC metrics
[Page 71]
XY Plot
Displays data output of the selected module in X,Y space (Figure 46).
Available for the modules QC, Normalization, Cell Typing (InSituType), Cell
Typing (CELESTA), Leiden Clustering, Neighborhood Analysis, and
Pathway Analysis.
From the dropdown menus, select the FOV, color coding method (count, sum, or average), and the clustering
step to plot. Click the arrow (caret) to display more customizations, including the specific gene(s) to
plot, cell type(s), density, honeycomb, or scatter view; and tools such as area selection and zoom.
Optionally, enable automatic scatter view when the number of points displayed is less than 10,000. Choose
from available color palettes from the dropdown menu.
Certain data from an XY Plot can also be overlaid on the tissue itself in the Image Viewer panel. See
Recommended Data Overlays and Interactivity using the Image Viewer Panel on page 33.
Figure 46: XY plot with honeycomb view in Pipeline Data panel
Heatmap
Displays data output of the selected module as a heatmap, sorting by FOV and targets (Figure 47). Available
for the modules QC, Normalization, Cell Typing (InSituType), Identify Marker Genes, Ligand-Receptor Analysis,
and Cell Type Colocalization.
The heatmap header options will differ depending on the heatmap and the module it represents.
For heatmaps from modules such as QC or Normalization, select the FOV to visualize from the dropdown menu.
Use the buttons to select Linear or Log2 scaled data; display the row and/or column names; and choose the
fit of the data displayed in the panel. Click the arrow (caret) for more options, including Zoom, Save, and
adjustments to the color palette.
For heatmaps from modules such as Identify Marker Genes, toggle between displaying All Genes or Top
Markers. If Top Markers, select the number of markers to display using the slider bar. You may select
additional markers to display (even if not a top marker) from the Genes dropdown list.
Heatmap name and axis names can also be edited.
Figure 47: Heatmap visualization of QC module data
[Page 72]
Box Plot and Violin Plot
Displays data output of the selected module as a box-and-whisker or violin plot (Figure 48). (Violin plot
only available for the Normalization module.)
Select FOV(s) for data visualization from the dropdown menu. Use the buttons in the box plot header to
select Linear, Log2, or Log10 scaled data; and hide/display points. Click the arrow (caret) for more options,
including a toggle between box, violin, or combination display; minimum expression value threshold; and
custom plot title and axis names. To export box plot data, click the Save icon in the top right of the chart.
Figure 48: Pipeline Data panel - box plot
Histogram
Displays the Number of Cells (y-axis) with a particular Counts per Cell value (x-axis) (Figure 49).
The histogram header options will differ depending on the histogram and the module it represents. Available
for the modules QC, Normalization, and Expression Model (CELESTA).
For histograms from modules such as QC or Normalization (Figure 49), select genes of interest from the
dropdown menu. If cell typing has been performed, select certain cell types under the second dropdown menu.
If desired, adjust the number of bins (how many categories the x-axis data is sorted into). Choose between
Linear, Log2, or Log10 scaling. Click the arrow (caret) to rename the axes and change the bin color and
opacity. Click the Save icon in the top right of the chart to export histogram data.
Figure 49: Pipeline Data panel - histogram
[Page 73]
PCA Plot
Displays the 2D representation of Principal Component Analysis as a scatter plot with default axes Principal
Component 1 (PCA_1) and Principal Component 2 (PCA_2) (Figure 50). Select alternative axes from the
Components dropdown menu. If cell typing has been performed, select particular cell types from the
third dropdown menu.
Click the arrow (caret) to access the selection tools.
Figure 50: Pipeline Data panel - PCA
UMAP Plot
Displays the UMAP analysis for all FOVs and flow cells in the study as a scatter plot (Figure 51).
Select a Color by option from the dropdown menu: morphology marker expression, flow cell, total counts, or
(depending on which modules have been run) Leiden clusters, cell types, spatial neighborhoods, or pathways.
Select cell types/clusters to plot from the dropdown menu.
Toggle Enable selection to on to allow the selection of data point(s) in the graph using a lasso, square,
circle, or rectangle annotation tool. The tools appear as icons on the top-right of the UMAP (you may need to
hide the header to see them).
Click the arrow (caret) to access additional visualization settings. These settings are based on the concept
of tiles, which can be thought of as an n x n grid that makes up the display. Select the data reduction method
(see the gray box, below). Adjust the Tile Count (number of tiles comprising the display), Tile Capacity,
Max Data Points, Points size, and Points Transparency, as desired. These are questions of aesthetics and
personal preference.
Figure 51: Pipeline Data panel - UMAP
Data reduction is required because there are not enough pixels on any screen to display the very large number of
points making up the UMAP. The method of data reduction can impact the shape of the UMAP and conclusions
drawn from it. Therefore, control over the method of data reduction is left to the user. Normalization method
normalizes data in each tile of the display; saturation method sets a maximum number of dots per tile. It may be
a matter of aesthetic preference for the user.
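To make the idea of a saturation-style reduction concrete, the following Python sketch caps the number of points kept in each tile of an n x n grid. It is only an illustration of the concept described in the box above: the function name, the rule that the first points encountered in a tile are the ones kept, and the mapping of Tile Count and Tile Capacity onto the two parameters are assumptions, not the documented AtoMx SIP behavior.

```python
def saturate_by_tile(points, tile_count, tile_capacity):
    """Keep at most tile_capacity points in each cell of a tile_count x tile_count grid."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    x0, y0 = min(xs), min(ys)
    x_span = (max(xs) - x0) or 1.0        # avoid division by zero for degenerate input
    y_span = (max(ys) - y0) or 1.0
    kept, per_tile = [], {}
    for x, y in points:
        col = min(int((x - x0) / x_span * tile_count), tile_count - 1)
        row = min(int((y - y0) / y_span * tile_count), tile_count - 1)
        n = per_tile.get((row, col), 0)
        if n < tile_capacity:             # tile not yet saturated: keep the point
            per_tile[(row, col)] = n + 1
            kept.append((x, y))
    return kept

# Hypothetical embedding: five overlapping points in one corner, three in the other
demo_points = [(0.0, 0.0)] * 5 + [(1.0, 1.0)] * 3
reduced = saturate_by_tile(demo_points, tile_count=2, tile_capacity=2)
print(len(reduced))
```

Dense regions are thinned while sparse regions are left untouched, which is why the choice of method and capacity can change the apparent shape of the plot.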
[Page 74]
Volcano Plot
Displays the results of the Differential Expression
module by plotting log2 fold changes on the x-axis
against -log10(p-values) on the y-axis.
Access the volcano plot visualization as an HTML file
after successful execution of the Differential
Expression module. Click on the image icon on the
module in the Pipeline Structure panel to download
the file.
Figure 52: Example of a volcano plot showing results of
Differential Expression analysis.
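The axis transformation used by a volcano plot can be written out in a few lines of Python. This sketch assumes the Differential Expression results are available as rows of (target, log2 fold change, p-value); that row layout, the function name, and the example values are hypothetical.

```python
import math

def volcano_coordinates(results):
    """Map (target, log2_fold_change, p_value) rows to (target, x, y) plot coordinates."""
    coords = []
    for target, log2_fc, p_value in results:
        y = -math.log10(p_value)          # smaller p-values plot higher on the y-axis
        coords.append((target, log2_fc, y))
    return coords

# Hypothetical results: GENE_A is strongly up-regulated and significant, GENE_B is neither
rows = [("GENE_A", 2.0, 0.001), ("GENE_B", -0.5, 0.5)]
coords = volcano_coordinates(rows)
```

A p-value of exactly 0 cannot be transformed this way and would need to be clipped to a small positive value before plotting.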
Flightpath Plot
Illustrates the tendency of different cell types to be
confused with each other (Figure 53). This plot type
displays cells in groups as a function of their
probability of being a particular cell type. Each cell type
is given a centroid, and placed near other cell types
with similar profiles. Then, each individual cell is
placed based on its probability of belonging to each
centroid. For example, cells with 100% confidence are
placed directly atop their centroid, and a cell with 50%
confidence in two cell types will be placed directly
between their centroids.
Figure 53: Example of a flightpath plot. Dots between
centroids represent cells with ambiguous identity.
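One simple way to read the placement rule described above is as a probability-weighted average of the centroids. The Python sketch below implements that reading; it is an assumption for illustration and may differ from how the Cell Typing (InSituType) module actually lays out the plot. The centroid coordinates and cell type names are hypothetical.

```python
def flightpath_position(probabilities, centroids):
    """Place a cell at the probability-weighted average of cell-type centroids."""
    total = sum(probabilities.values())
    x = sum(p * centroids[ct][0] for ct, p in probabilities.items()) / total
    y = sum(p * centroids[ct][1] for ct, p in probabilities.items()) / total
    return (x, y)

# Hypothetical centroids for two cell types
centroids = {"T cell": (0.0, 0.0), "B cell": (4.0, 2.0)}

confident = flightpath_position({"T cell": 1.0}, centroids)                  # atop the T cell centroid
ambiguous = flightpath_position({"T cell": 0.5, "B cell": 0.5}, centroids)   # midway between the two
```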
Access the flightpath plot as a .PNG file after
successful execution of the Cell Typing (InSituType)
module. Click on the image icon
on the module in the Pipeline Structure panel to
download the file.
Flightpath plots generated by the Cell Typing (InSituType) module are labeled with cluster identification (a, b, c...)
and confidence score (average of the cluster's cells' probability of belonging to that cluster). Clusters can be
renamed using the Cell Type QC module (see section Cell Type QC - RNA on page 60).
[Page 75]
Save a Visualization
If you modify a visualization's default settings, then
navigate away from the visualization, the software will
prompt you to save the visualization settings (Figure 54).
Click Cancel to navigate away without saving the
settings.
To save the settings, enter a settings name and click
Save. Once saved, the visualization is available from the
dropdown menu at the top of the Pipeline Data panel
(Figure 55).
Figure 55: Saved visualization settings available in dropdown
menu
Figure 54: Save visualization settings prompt
[Page 76]
Export Images
Open the Image Viewer panel and select the Export tab from the
Image Viewer menu (Figure 56). Select from options to export the
full image or the on-screen view, and customize the appearance,
format, and quality. Individual image layers can be selected or
deselected. Cell segmentation in the export is based on its
visibility in the Image Viewer. To exclude the Preview scan from
the export, disable (unselect) the channels for this layer on the
right side of the Image Viewer.
Figure 56: Export Images from the Image Viewer
Exported images are downloaded directly, and a notification with a
link appears in the AtoMx SIP notifications pane. Please note that
the link in the notifications pane expires 6 hours after the time
of download.
External users may not export images or data.
If exporting file type .jpg results in an error, exclude the scalebar by unchecking its box (Figure 56) and try again.
[Page 77]
Export Data
The built-in export function exports decoded data files, Seurat object(s), corresponding TileDB array, and/or flat .csv
files. The decoded data comprises transcript counts and locations, annotation metadata, and user-initiated data
transformations performed in AtoMx SIP prior to export. All results up to the point of export will be available in the
Seurat object and TileDB array. While the RNA and Protein studies share the same format, the structure of the
Additional Files folder will vary based on the analyte. Data is exported on a per-study basis (Seurat objects are split
by flow cell).
Beginning in AtoMx SIP v2.0, exported data can be directed to an AWS S3 bucket or a tenant-specific host location
which is accessed by sFTP client to download data locally. Data is retained in the tenant-specific host location for 2
weeks from export. In AtoMx SIP v2.0, both methods of data export generate an md5sum file for checking the
integrity of the export job.
External users may not export images or data.
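After downloading an export, the md5sum file can be used to confirm that nothing was corrupted in transit. The Python sketch below assumes the file follows the usual md5sum layout (one "hash  relative/path" pair per line); the manual does not specify the layout, so check the actual file before relying on this. The function names are hypothetical.

```python
import hashlib
from pathlib import Path

def md5_of_file(path, chunk_size=1024 * 1024):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def find_mismatches(manifest_path, root):
    """Return relative paths whose MD5 differs from the manifest entry."""
    mismatches = []
    for line in Path(manifest_path).read_text().splitlines():
        if not line.strip():
            continue
        expected, name = line.split(maxsplit=1)
        name = name.lstrip("*")           # md5sum marks binary-mode entries with '*'
        if md5_of_file(Path(root) / name) != expected.lower():
            mismatches.append(name)
    return mismatches
```

An empty list from find_mismatches(manifest, export_folder) means every listed file matched its recorded checksum.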
To export data,
1. In the study of interest, click Export from the Study Details Panel
(Figure 57) to launch the Export Dataset dialog.
Figure 57: Export button in Study
Details panel
Figure 58: Export Dataset dialog, Input
Parameters tab
2. In the tab Input Parameters, select the files and/or objects to export
(Figure 58). Refer to Table 8 for file descriptions. To prevent the
duplication of files and associated costs, export decoded files only
once per study. If desired, enter a value for the maximum allowed
export timeout duration (4 - 96 hr; default value is 48 hr; the export will
stop if it exceeds the duration selected).
[Page 78]
Flat CSV Files
  Count matrix: Count matrix file (cells with gene counts)
  Cell metadata: Cell metadata file (local (within the FOV) / global (within the flow cell) cell coordinates,
    morphology marker intensities)
  Transcript: Global transcripts file (global coordinates for every individual transcript on the flow
    cell; not applicable for protein studies)
  Polygons: Global cell boundaries file (global coordinates for the vertices of the polygon that
    represents the cells' boundaries - a representation of the cell segmentation)
  FOV positions: Global FOV position file (coordinates of the top left of each FOV)
Tertiary Analysis Objects
  Seurat object(s): Seurat object(s) composed of counts, metadata, and dimensional reduction outputs.
    One Seurat object is exported for each flow cell in the study.
  Transcript coordinates: The exported Seurat object(s) will include transcript coordinates
  Polygon coordinates: The exported Seurat object(s) will include polygon (cell segmentation) coordinates
  TileDB array: Default tertiary analysis structure
Additional Files
(WARNING! Large data. Including these files will significantly increase export folder size.)
  Additional Files: Not recommended to export - large data.
  Morphology2D folder: Morphology images - large data.
  Other miscellaneous data files: If available - large data.
Table 8: Files for export
3. In the Export Access tab of the Export Dataset dialog, select sFTP or S3 (Figure 59).
• For sFTP access, copy the credentials and Output Folder Name provided.
• For S3 access, provide the S3 file path in the format s3://bucket/object/, AWS keys which have write
capabilities to this S3 bucket, AWS region in the format us-west-2, and session token (if configured). Refer to
Appendix III: Setup to Export Data to an AWS S3 Bucket on page 90 for more information. Please note that
large exports may exceed the 12-hour limit of AWS session tokens.
[Page 79]
4. Click Export. Export progress is updated in the Study Details panel. Once
complete, view and download export job logs from the link Show Export
Details.
Figure 59: Export Dataset dialog, Export Access tab
5. To access data exported to the tenant-specific host location using a
secure file transfer protocol (sFTP) client:
a. If you don't yet have an sFTP client, download one
such as WinSCP. Consult with your institution's IT
team to ensure compliance with internal policies.
b. Open the program and click New Site (Figure 60).
Steps may differ if using a program other than
WinSCP.
c. Enter the following information and click Login.
• Host name: copy exactly from the Export Dataset
dialog, Export Access tab (Figure 59).
• Port number: 22
• Username and password: copy the username exactly
(case-sensitive) from the Export Dataset dialog (Figure 59)
in the format username@tenantname.com.
The password is the one used to access AtoMx SIP.
Figure 60: WinSCP connection to exported data
d. Once connected, the available exported files are displayed in the right pane (Figure 61). Download specific
studies by selecting them and clicking Download, or move folders or files to a local folder selected in the left
pane. Refer to Export Data on page 77 for details about output.
[Page 80]
Figure 61: Exported data in WinSCP window
Data are retained in the tenant-specific host location for 2 weeks from export.
Please be aware of these considerations when exporting data or troubleshooting data export:
• If export to an S3 bucket fails, confirm that the AWS credentials used have Write permissions for the bucket.
• If receiving the error "Previous export failed" and the export details indicate "Pod terminated preemptively", or the
export job logs indicate "ExpiredToken", the export job exceeded the timeout duration set by the user or set by
AWS for S3 sessions (12 hours). Please break up the export into smaller jobs by reducing the number of flow
cells in the study or selecting fewer files to export from the Export Dataset dialog (Figure 58).
• If md5sum checksum values don't match, run the export job again. If the values still don't match, contact
support.spatial@bruker.com for support.
• An alternative method of downloading study data from AtoMx SIP is to use the CosMxDataDownloader, available
upon request. This Python application requires proficiency with the command line. Contact
support.spatial@bruker.com for more information.
• If the study is too large to reliably include transcripts in the Seurat object, a transcript flat file is generated instead.
• For data exported for access by sFTP, Org Admins can see previous export activity of all users, but export output
is sent only to the sFTP folder of the user who initiated the export.
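The error messages quoted in the list above can be mapped to their suggested actions with a small helper, which may be convenient when scanning downloaded export job logs. This is a sketch only: the function name is hypothetical, and only the three messages quoted in this section are recognized.

```python
def suggest_export_fix(message):
    """Map an export error message to the action suggested in the troubleshooting list."""
    text = message.lower()
    if "expiredtoken" in text or "pod terminated preemptively" in text:
        return "Timeout exceeded: split the export into smaller jobs."
    if "previous export failed" in text:
        return "Open Show Export Details and the export job logs to find the cause."
    return "No known message: check S3 write permissions or contact support."

print(suggest_export_fix("An error occurred (ExpiredToken) when calling PutObject"))
```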
[Page 81]
Working with Exported Data
These computational packages are required to interact with exported data objects:
• Seurat: R Toolkit for Single Cell Genomics. Install in RStudio using install.packages("Seurat")
• tiledbsc: an R implementation of the Stack of Matrices, Annotated (SOMA). Install in RStudio using
remotes::install_github("tiledb-inc/tiledbsc", force = TRUE)
• tiledbr: an R interface to the storage engine of TileDB. Install in RStudio using
remotes::install_github("TileDB-Inc/TileDB-R", force = TRUE)
For additional resources on analyzing CosMx SMI data outside of AtoMx SIP, please refer to the Biostats CosMx
Analysis Scratch Space and accompanying blog, hosted on Github. For a detailed walk-through of exported CosMx
SMI data, see the CosMx SMI Liver Dataset vignette.
The following diagrams illustrate the structure of data
exported using the Export function. (Please note that the
structure is subject to change, so this information is
provided as general guidance.) The input parameters
selected in the Export button dialog for this example are
shown in Figure 62. (For large studies containing multiple
flow cells, it is not recommended to select all files for
download, as depicted in the figure. Bin the exports or
reduce the number of flow cells in the study.)
Figure 62: Input parameters for the exported data shown in
subsequent figures
Figure 63 shows the second-level directory tree of the
export folder for an RNA flow cell named "6K_TMA". The
TileDB folder is the exported TileDB array. The tertiary
analysis objects (Seurat objects) are in the root folder. One
Seurat object is exported for each flow cell in the study.
Figure 63: Second level directory tree of
export folder of an RNA study. See
subsequent figures to expand the
DecodedFiles and flatFiles folders.
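Because the export structure is subject to change, it can be useful to list what a downloaded export folder actually contains rather than rely on the figures. The Python sketch below prints a directory tree down to a chosen depth; the function name is hypothetical and no folder names are assumed.

```python
from pathlib import Path

def list_tree(root, max_depth=2):
    """Return relative paths of entries down to max_depth levels below root."""
    root = Path(root)
    entries = []
    for path in sorted(root.rglob("*")):
        rel = path.relative_to(root)
        if len(rel.parts) <= max_depth:
            # A trailing slash marks directories
            entries.append(rel.as_posix() + ("/" if path.is_dir() else ""))
    return entries

# Usage: for entry in list_tree("path/to/export_folder", max_depth=2): print(entry)
```

Raising max_depth to 4 gives a listing comparable to the fourth-level trees shown for the DecodedFiles folder.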
[Page 82]
Figure 64 expands the directory to the fourth level of the DecodedFiles folder.
• The AnalysisResults and AnalysisResultsArchived folders contain log files per FOV of target call and run logs.
• The CellComposite folder contains 5-channel composite images per FOV. CellComposite images are available in
Control Center v1.4.1 / AtoMx v1.3.2.4 and later, but not in the previous Control Center or AtoMx versions. For
flow cells run on previous versions, CellComposite images can be created from Morphology2D images (resource
on Github).
• The CellOverlay folder contains greyscale images of each FOV with cell segmentation overlaid on the image.
• The FOV# folders contain cell and compartment label image files displaying pixel intensity values for cell_ID's and
segmentation compartments and mean/max intensity values for each morphology marker and cell ID.
• The Morphology2D folder contains 5-channel layered tif images of each FOV. These files can be used to create
CellComposite images (resource on Github) (see CellComposite bullet, above).
• The RnD folder contains run summary statistics of membrane and nuclei signal, membrane segment, cell
coverage, the average cell area and number of cells for each FOV, as csv files.
• The Segmentation folder contents are identical to the output structure of the CellOverlay folder, FOV# folder, and
RnD folder. The greyscale images, cell and compartment labels and run summary are specific to the applied cell
segmentation profile.
• The RunSummary folder contains the experimental configuration parameters of the run and the spatial metrics
(cycle/reporter number, fiducial intensity and background, UV cleavage efficiency, etc.). The imaging shading
profile is stored in the Shading folder. The FovTracking folder contains files mapping the FOV position to stage
position, and the QCDir includes the run summary status for each FOV including registration status, channel
intensity, and spot quality as csv files.
• The Logs folder contains log files of the imaging run.
Figure 65 shows the second level directory tree of the flatFiles folder. The folder contains all flat files for downstream
data analyses, as set in the input parameters of the Export button.
[Page 83]
Figure 64: Fourth level directory tree of DecodedFiles folder of an RNA study.
Figure 65: Second level directory tree of the flatFiles folder of an RNA study.
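As a starting point for working with the flat files outside of R, the Python sketch below sums the counts in each row of a count matrix CSV (one row per cell, one column per target). The identifier column names fov and cell_ID and the demo data are assumptions for illustration; check the header of the exported file and adjust id_columns accordingly.

```python
import csv
import io

def total_counts_per_cell(handle, id_columns=("fov", "cell_ID")):
    """Sum all target columns for each row (cell) of a count matrix CSV."""
    totals = {}
    for row in csv.DictReader(handle):
        key = tuple(row[c] for c in id_columns)
        # Every column that is not an identifier is treated as a target count
        totals[key] = sum(int(v) for k, v in row.items() if k not in id_columns)
    return totals

# Hypothetical two-cell, two-target matrix standing in for an exported file
demo = io.StringIO("fov,cell_ID,GENE_A,GENE_B\n1,1,3,0\n1,2,1,4\n")
totals = total_counts_per_cell(demo)
```

For a real file, pass an open file handle instead of the StringIO object; if the file is gzip-compressed, gzip.open(path, "rt") from the standard library returns a compatible text handle.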
[Page 84]
Figure 66 shows the second-level directory tree of
the export folder for a protein study named
"NanoString_protein" and flow cell named "Protein".
The TileDB folder is the exported TileDB array. The
tertiary analysis objects (Seurat objects) are in the
root folder. One Seurat object is exported for each
flow cell in the study.
Figure 66: Second level directory tree of export folder of
a protein study. See subsequent figures to expand the
DecodedFiles and flatFiles folders.
Figure 67 shows the fourth level directory tree of the
DecodedFiles folder for a protein flow cell.
• The AnalysisResults and AnalysisResultsArchived
folders contain per channel stats for each FOV and
run logs.
• The CellComposite folder contains 5-channel composite images per FOV. CellComposite images are available in
Control Center v1.4.1 / AtoMx v1.3.2.4 and later, but not in the previous Control Center or AtoMx versions. For
flow cells run on previous versions, CellComposite images can be created from Morphology2D images (resource
on Github).
• The FOV# folders within CellStatsDir contain cell and compartment label image files displaying pixel intensity
values for cell_ID's and segmentation compartments and mean/max intensity values for each morphology
marker and cell ID.
• The CellOverlay folder within CellStatsDir holds greyscale images of each FOV with cell segmentation overlaid
on the image.
• The Morphology2D folder contains 5-channel layered tif images of each FOV. These files can be used to create
CellComposite images (resource on Github) (see CellComposite bullet, above).
• The RnD folder contains run summary csv files for each FOV, including the percentage of cells with membrane
and nuclei signal, the average membrane segment, cell coverage, average cell area and number of cells.
• The Segmentation folder contents are identical to the output structure of the CellOverlay folder, FOV# folder, and
RnD folder. The greyscale images, cell and compartment labels and run summary are specific to the applied cell
segmentation profile.
• The RunSummary folder contains the experimental configuration parameters of the run and the spatial metrics
(cycle/reporter number, focus and x, y, z position for each channel, UV cleavage efficiency, etc.). The distortion
and imaging shading profiles are stored in the Distortion and Shading folders, respectively. The FovTracking
folder contains files mapping the FOV position to stage position.
• The Logs folder contains log files of the imaging run.
[Page 85]
Figure 67: Fourth level directory tree of DecodedFiles folder of a protein study.
Figure 68: Second level directory
tree of AnalysisResults folder
Figure 68 shows the second-level directory tree of the AnalysisResults folder. The PerCellStats folder
contains cell statistics for all channels and for each protein target in csv format. The ProteinImages folder
contains protein image files for each protein in 16-bit TIFF format displaying protein expression. The ProteinMasks
folder contains mask files for each protein showing the target area.
Figure 69 shows the second level directory tree of the flatFiles folder of a protein study. The folder contains all flat
files for downstream data analyses, as set in the input parameters of the Export button. See note about flat file
compression in the box on page 77.
Figure 69: Second level directory tree of
the flatFiles folder of a protein study.
[Page 86]
Appendix I: Literature References
These references provide additional information on the modules of the CosMx SMI Data Analysis Suite.
Cell Segmentation:
  https://github.com/MouseLand/cellpose
  https://cellpose.readthedocs.io/en/latest/#
Quality Control:
  https://www.itl.nist.gov/div898/handbook/eda/section3/eda35h1.htm
Normalization - RNA:
  https://scanpy-tutorials.readthedocs.io/en/latest/tutorial_pearson_residuals.html
  https://genomebiology.biomedcentral.com/articles/10.1186/s13059-021-02451-7
  https://satijalab.org/seurat/reference/normalizedata
UMAP:
  https://pubmed.ncbi.nlm.nih.gov/30531897/
Cell Typing (InSituType):
  https://www.biorxiv.org/content/10.1101/2022.10.19.512902v1.full
Expression Model and Cell Typing (CELESTA):
  https://doi.org/10.1038/s41592-022-01498-z
Neighborhood Analysis:
  https://pubmed.ncbi.nlm.nih.gov/32763154/
  https://pubmed.ncbi.nlm.nih.gov/27818791/
Leiden Clustering:
  https://www.nature.com/articles/s41598-019-41695-z
Spatial Expression Analysis:
  https://link.springer.com/article/10.1007/s101090100064
Differential Expression:
  https://github.com/glmmTMB/glmmTMB
  https://github.com/rvlenth/emmeans
  https://www.nature.com/articles/s42003-021-02146-6
Cell Type Co-Localization:
  https://www.jstor.org/stable/2984796
  https://book.spatstat.org/
Signaling Pathways:
  https://pubmed.ncbi.nlm.nih.gov/28991892/
  https://github.com/aertslab/AUCell
  https://bioconductor.org/packages/release/bioc/html/AUCell.html
Pathway Analysis:
  https://doi.org/10.1038/nmeth.4463
Novae:
  https://www.nature.com/articles/s41592-025-02899-6
[Page 87]
Appendix II: Create a Signature Matrix and Tuning Parameter File for Cell Typing (CELESTA)
Refer to Cell Typing (CELESTA) - Protein on page 59 for information about the purpose of the signature matrix and
tuning parameter files. To create a custom signature matrix,
1. Download the default signature matrix (human or mouse) from Github/Nanostring-Biostats/CelestaSignatureLibrary
to use as a template (from the matrix file's page in GitHub, click Raw to open the raw data in the browser.
Right-click and select Save As... to save as a .csv file to your computer (Figure 70)).
Figure 70: Click 'Raw' then right-click, Save As... to save the default signature matrix from the
CELESTA Signature Library in GitHub.
2. Do not edit the contents of cell A1 or B1. Edit Column A's rows to list the cell type names to be assigned through
cell typing. Edit Column B's rows to indicate the lineage level of each cell type, using the format:
ClusteringLevel_CellTypeNumberDescendedFrom_OverallCellTypeNumber (Figure 71).
Figure 71: Creating a custom signature matrix - lineage levels.
[Page 88]
3. Edit the column headers (starting at Column C) to match the names of the markers which will define the cell
typing. The entire CosMx SMI panel does not need to be included. Only the markers listed will be used for cell
typing.
4. Fill in the matrix to reflect the marker expression in each of the cell types (Figure 72). A value of 1 indicates the
protein should be expressed in the cell type; 0 indicates it should not be expressed. Blank (or NA, if read into R)
indicates the protein is not considered in the scoring function.
Figure 72: Creating a custom signature matrix - expression values.
5. Save the signature matrix file, then upload it to the Expression (CELESTA) or Cell Typing (CELESTA) module
parameters.
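Before uploading, the edited signature matrix can be checked against the rules in the steps above with a short Python sketch. It assumes the three parts of the lineage value are whole numbers separated by underscores and that marker cells contain only 1, 0, blank, or NA; the function name, the placeholder header cells A1 and B1, and the demo rows are hypothetical.

```python
import csv
import io
import re

LINEAGE = re.compile(r"^\d+_\d+_\d+$")    # ClusteringLevel_ParentNumber_OverallNumber
ALLOWED = {"1", "0", "", "NA"}            # expressed, not expressed, not considered

def check_signature_matrix(handle):
    """Return a list of problems found in a signature matrix CSV."""
    rows = list(csv.reader(handle))
    markers = rows[0][2:]                 # marker names start at Column C
    problems = []
    for line_no, row in enumerate(rows[1:], start=2):
        if not LINEAGE.match(row[1].strip()):
            problems.append(f"row {line_no}: lineage '{row[1]}' is not in N_N_N form")
        for marker, value in zip(markers, row[2:]):
            if value.strip() not in ALLOWED:
                problems.append(f"row {line_no}: {marker} has value '{value}'")
    return problems

# Hypothetical matrix: the second cell type has a malformed lineage and an invalid value
demo = io.StringIO("A1,B1,CD3,CD20\nT cell,1_0_1,1,0\nB cell,1-0-2,0,2\n")
problems = check_signature_matrix(demo)
```

An empty list means the file passed these basic checks; it does not confirm that the marker names match those used by the software.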
To create a custom tuning parameter file,
1. Download the default tuning parameter file from Github/Nanostring-Biostats/CelestaSignatureLibrary to use as a
template (see download instructions above).
2. Edit the tuning parameter file to match the signature matrix it will be run with: the same number of rows and the
same cell type names in Column 1.
3. Next, fill in the tuning parameters (Figure 73) as described below. It's recommended to tune 1-2 markers at a
time (edit the file, run Cell Typing (CELESTA), evaluate results, edit the file again, and re-run, as needed).
[Page 89]
Figure 73: Creating a custom tuning parameter file.
• Adjust the high expression threshold values for anchor cells (Column B): if a cell's probability of expressing the
canonical marker(s) for this cell type (defined in the signature matrix) is greater than this threshold, the cell can
become an "anchor cell" in the CELESTA algorithm (see Cell Typing (CELESTA) - Protein on page 59). Increasing
the value makes the cell typing calls more stringent (cells must have higher expression of the marker(s) named in
the signature matrix to be designated an anchor cell).
• Adjust the high expression threshold values for index cells (Column D): a cell's probability of expressing the
canonical marker(s) for this cell type (defined in the signature matrix) must be greater than this threshold to be
assigned the cell type. Increasing the value makes the cell typing calls more stringent (cells must have higher
expression of the marker(s) named in the signature matrix to be designated as that cell type).
• The low expression threshold values are generally robust and do not require tuning. They specify that a cell may
be considered a particular cell type as long as the markers that should NOT be expressed in that cell type score <
0.9/1. For example, a cell may be typed as an immune cell as long as the non-immune markers (as defined in the
signature matrix) are not very highly expressed. If the non-immune markers are highly expressed, there is no
justification to call the cell an immune cell.
4. Save the tuning parameter file, then upload it to the Cell Typing (CELESTA) module parameters.
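The requirement that the tuning parameter file has the same number of rows and the same cell type names in Column 1 as the signature matrix can be checked before upload. The Python sketch below compares the two files once each has been read into a list of rows (for example with csv.reader); the function name and the demo rows are hypothetical.

```python
def compare_cell_types(signature_rows, tuning_rows):
    """Compare Column 1 (cell type names) of the two files, header row excluded."""
    sig = [row[0] for row in signature_rows[1:]]
    tun = [row[0] for row in tuning_rows[1:]]
    return {
        "same_row_count": len(sig) == len(tun),
        "missing_from_tuning": [name for name in sig if name not in tun],
        "unexpected_in_tuning": [name for name in tun if name not in sig],
    }

# Hypothetical files: the tuning file misspells one cell type name
signature_rows = [["A1", "B1", "CD3"], ["T cell", "1_0_1", "1"], ["B cell", "1_0_2", "0"]]
tuning_rows = [["header"], ["T cell"], ["Bcell"]]
report = compare_cell_types(signature_rows, tuning_rows)
```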
If the Cell Typing (CELESTA) module fails in the pipeline, the reason could be that the marker names in the
signature matrix do not exactly match the marker names as encoded in the software. Open the module run log
file by clicking on the metrics icon on the module block in the Pipeline Structure panel, and check for the error
message "...markers not found in marker_expr_matrix." If this error occurred, the marker specified in the error
message has a different name in the CosMx SMI software. Check the dropdown gene lists in the CosMx
SMI Data Analysis Suite to see what name the software uses for the marker, and change the appropriate column
header in the signature matrix to match. Then, re-run the pipeline step.
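A marker name mismatch of this kind can be caught before running the pipeline by comparing the signature matrix header against the marker names shown in the software. The Python sketch below assumes those names have been copied into a list by hand; the function name and the example marker names are hypothetical, and the close-match suggestions come from the standard library difflib module rather than from AtoMx SIP.

```python
import difflib

def unmatched_markers(signature_header, software_markers):
    """Return {marker: close matches} for marker columns the software does not list."""
    known = list(software_markers)
    report = {}
    for marker in signature_header[2:]:   # marker columns start at Column C
        if marker not in known:           # exact, case-sensitive comparison
            report[marker] = difflib.get_close_matches(marker, known, n=3)
    return report

# Hypothetical header and software marker list
header = ["A1", "B1", "CD3", "PD1"]
report = unmatched_markers(header, ["PD-1", "CD3", "CD45"])
```

Any marker that appears in the report needs its column header changed to the name the software uses.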
[Page 90]
Appendix III: Setup to Export Data to an AWS S3 Bucket
The following steps are provided to supplement AWS user documentation. Refer to Getting Started with Amazon
S3 for more comprehensive instructions and technical support, and/or speak with your institution's Informatics or
IT team. The AWS Free S3 5 GB plan will suffice for the transfer and storage of most studies (excluding decoded data
files). (Transfer of larger files is permitted with this plan, but will incur a modest cost.) Plan options may change;
please refer to AWS user documentation to decide the best plan for your needs.
Figure 74: Sign in as root user
1. Create an AWS account.
2. Sign in as a Root user (Figure 74).
3. Click on Services in the top left of the screen, then click Storage, then
S3 (Figure 75).
Figure 75: Storage window: S3
[Page 91]
4. Click Create bucket (Figure 76; your view may be different).
Figure 76: Create bucket button
5. Fill in Bucket name: do not use spaces, uppercase letters, or special characters (Figure 77). Choose the AWS
region that matches the AtoMx SIP AWS region (if known) (e.g. US East, EU, etc.). Leave all other options on this
page as the default values. Click Create bucket.
Figure 77: Fill in bucket name and choose AWS region
6. Return to Buckets and note the AWS region associated with your bucket (Figure 78). Click on the name of your
newly created bucket to access it.
Figure 78: Buckets window
7. Click Create folder to make a new destination folder for CosMx SMI data export (Figure 79).
[Page 92]
Figure 79: Create folder button
8. Fill in the folder name (do not use spaces or special characters). Select Encryption key type: Amazon S3
managed keys. Click Create folder (Figure 80).
Figure 80: Enter a folder name
9. Check the box to the left of your newly created folder and click Copy S3 URI (the URI should be in the format:
s3://atomxtest/s3demo/) (Figure 81). This is the destination S3 file path which you will input to the CosMx
SMI Export Dataset dialog in AtoMx SIP.
[Page 93]
Figure 81: Copy S3 URI button
10. Next, you will generate access keys to access this S3 bucket. Click on Services, then Security, Identity,
& Compliance, then select IAM (Figure 82).
Figure 82: IAM settings
[Page 94]
11. Select Users from the left menu (Figure 83).
Figure 83: Select Users
12. Click Add users (Figure 84).
Figure 84: Add users button
13. In the User details window, specify a user name (Figure 85). Click Next.
Figure 85: Specify a new user name
14. In the Set permissions window, select Permission options: Add user to group, then click Create group
(Figure 86).
[Page 95]
Figure 86: Set permissions window
15. In the window Create user group, create a user group name without spaces (Figure 87). Under Permissions
policies, search for S3 then check the box for AmazonS3FullAccess. Click Create user group.
Figure 87: Create user group window
16. You should see the User group you've just created in the list User groups (Figure 88). Check the box to the left
of the name, then click Next.
[Page 96]
Figure 88: Check the box next to the user group name
17. In the Review and create window, confirm that the group name is listed in the Permissions Summary (Figure
89). Click Create User.
Figure 89: Create user - permissions summary
18. A pop-up message indicates that the user is created successfully (Figure 90). Click View user.
Figure 90: View user button
19. Click on the tab Security credentials (Figure 91).
[Page 97]
Figure 91: Security credentials from username menu
20. Scroll down to Access keys and click Create access key (Figure 92).
Figure 92: Create access key button 1
21. You may be presented with a list of alternatives to access keys (Figure 93). Select Other and click Next.
Figure 93: Alternatives to access keys
[Page 98]
22. Optionally enter a description tag for the key, then click Create access key (Figure 94).
Figure 94: Optional description for access key
23. Copy the access key and secret access key by clicking each copy icon (Figure 95) and pasting them in a safe
place. Click Download .csv file to keep an additional backup record of your S3 access key and secret access
key.
Figure 95: Retrieve access key credentials
24. The destination S3 file path (or S3 URI), access key, secret key, and region are then entered into the fields of the
Export Dataset dialog (refer to the instructions on page 77). The export can be configured with session tokens as
well, although please note that large exports may exceed the 12-hour limit of AWS session tokens.
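Before pasting the values into the Export Dataset dialog, their format can be sanity-checked with a short Python sketch. The function names are hypothetical, and the patterns are simplified assumptions: the bucket name check follows the guidance in this appendix (lowercase letters and digits, with dots and hyphens also tolerated, 3 to 63 characters), and the region check only verifies the general shape of identifiers such as us-west-2.

```python
import re

def parse_s3_uri(uri):
    """Split s3://bucket/prefix/ into (bucket, prefix); raise ValueError if malformed."""
    match = re.fullmatch(r"s3://([^/\s]+)/(\S*)", uri)
    if not match:
        raise ValueError("expected an S3 URI such as s3://bucket/folder/")
    return match.group(1), match.group(2)

def bucket_name_ok(name):
    """Simplified check: no spaces, uppercase letters, or special characters."""
    return re.fullmatch(r"[a-z0-9][a-z0-9.-]{1,61}[a-z0-9]", name) is not None

def region_ok(region):
    """Simplified check for region identifiers shaped like us-west-2."""
    return re.fullmatch(r"[a-z]{2}(-[a-z]+)+-\d", region) is not None

bucket, prefix = parse_s3_uri("s3://atomxtest/s3demo/")
print(bucket, prefix, bucket_name_ok(bucket), region_ok("us-west-2"))
```

These checks catch formatting slips only; they do not confirm that the bucket exists or that the access keys have write permission.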
[Page 99]
AppendixIV:DownloadCosMxSMIFilesFromS3BucketAfterExport
NOTE:It maybe preferableto manageyourdecodeddatain AWSratherthandownloadit to a local
environment.DownloadingfromS3(fileegress)incurscostsaccordingtoAWSpricingstructure.Pleasereferto
AWSdocumentationforadditionalsupportmanagingyourdecodeddatainAWS.
FilescanbedownloadedusingtheS3console;foldersmustbedownloadedusingcommandlineinterface(CLI).
ThefollowingstepsmaybeperformedusingAWSCLI(preferred)orAnacondaPrompt.
IMPORTANT: Before starting, ensure that the local environment has enough free space to accommodate
the files.
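The free-space check can be scripted. The sketch below uses Python's standard shutil.disk_usage. The function name has_room_for, the 10% headroom, and the idea of reading the expected download size from the S3 console are assumptions made for illustration, not requirements stated in this manual.

```python
import shutil
from pathlib import Path


def has_room_for(download_bytes: int, target_dir: str = ".", headroom: float = 0.10) -> bool:
    """Return True if the drive holding target_dir can fit the download.

    download_bytes is the expected size of the S3 folder (assumed to be
    known, for example from the S3 console). headroom is an extra margin,
    here an assumed 10%, so the drive is not filled completely.
    """
    if download_bytes < 0:
        raise ValueError("download_bytes must not be negative")
    free_bytes = shutil.disk_usage(Path(target_dir)).free
    return free_bytes >= download_bytes * (1 + headroom)
```

For a 250 GB export, has_room_for(250 * 10**9, "C:/testdownload") would report whether that folder's drive has enough room. The folder name is only an example and must already exist.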
1. Install the AWS CLI script package on your computer (or, if using Anaconda Prompt, install the Anaconda Client).
a. Navigate to the AWS Command Line Interface website and download the appropriate installer from the links
on the right (Figure 96).
b. Follow the prompts to install CLI to your computer.
Figure 96: Download AWS CLI Installer
2. Once the install is finished, open Command Prompt on Windows by pressing the Windows key on your keyboard and
typing ‘command’ (Figure 97). Click the Command Prompt to open.
[Page 100]
Figure 97: Open Command Prompt
3. Type in aws configure (Figure 98).
a. When asked to provide the Access Key, enter the access key for the IAM user.
b. When asked to provide the Secret Key, enter the secret access key for the IAM user.
c. When asked for the region, enter the region of the S3 bucket, such as us-west-1 or eu-central-1. This information can
be found in the main folder of your S3 bucket.
d. When asked for the output format, enter text.
Figure 98: Input for AWS Configure command
4. Type in aws s3 ls to check whether the bucket is configured correctly. The output should show the name of
your S3 bucket.
5. Within the command prompt, navigate to the folder to which to download the decoded data. For example, if the
target folder is ‘testdownload’ on your C: drive, type cd testdownload (Figure 99). The command cd ..
navigates back one step in the C: drive folder hierarchy.
Figure 99: Navigate to target folder
6. Within your AWS S3 account, navigate to the folder to download and select it with a checkmark (Figure 100).
Click on Copy S3 URI to copy the folder location.
[Page 101]
Figure 100: Copy S3 folder URI
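7. In the command prompt, type aws s3 sync, then paste the URI copied from the S3 folder in Step 6. Following
the pasted URI, type a space and then a period (.) (Figure 101). The period stands for the current folder, so the files are downloaded to the folder chosen in Step 5.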
Figure 101: Sync CLI to S3 folder
8. The download will begin once you press Enter. The download time depends on the size of the files.
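Steps 5 through 8 can also be driven from a script. The sketch below assembles the same aws s3 sync command that is typed by hand and runs it only if the AWS CLI is found on the PATH. The helper names build_sync_command and run_sync are hypothetical, and the sketch assumes that aws configure (step 3) has already been completed.

```python
from __future__ import annotations

import shutil
import subprocess


def build_sync_command(s3_uri: str, destination: str = ".") -> list[str]:
    """Return the argument list for: aws s3 sync <s3_uri> <destination>.

    The default destination "." is the current folder, matching the
    period typed at the end of the command in step 7.
    """
    if not s3_uri.startswith("s3://"):
        raise ValueError(f"expected an S3 URI starting with s3://, got {s3_uri!r}")
    return ["aws", "s3", "sync", s3_uri, destination]


def run_sync(s3_uri: str, destination: str = ".") -> int:
    """Run the sync and return the exit code (0 on success)."""
    if shutil.which("aws") is None:
        raise RuntimeError("AWS CLI not found on PATH; install it first (step 1)")
    completed = subprocess.run(build_sync_command(s3_uri, destination), check=True)
    return completed.returncode
```

Calling run_sync("s3://my-bucket/decoded/", "C:/testdownload") would start the download into that folder. Both the bucket name and the folder are placeholders. Passing the command as a list rather than a single string avoids quoting problems when the URI contains spaces.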
For technical support, please contact your Field Applications Scientist or support.spatial@bruker.com.
[Page 102]
Troubleshooting and Technical Support
For additional support, contact support.spatial@bruker.com.
Issue: "Something went wrong", "Study locked", or "SMIDA failed to authorize" message
Possible Cause: Session timeout
Suggested Actions: Log out of AtoMx SIP and log in again.

Issue: Study creation failed
Possible Cause: Processing issue
Suggested Actions: Evaluate log files for study creation. Click on Details and logs in the Study Details panel (Figure 102). Log files are downloaded as zipped files to your local Downloads folder. Once unzipped, they can be opened in a text editor such as Notepad.
Figure 102: Details and logs link in the Study Details panel.

Issue: A pipeline module fails
Possible Cause: Processing issue
Suggested Actions: Evaluate log files for an individual module by clicking the metrics icon on the module block in the Pipeline Structure panel, then click the download icon (Figure 103). Log files are downloaded as zipped files to your local Downloads folder. Once unzipped, they can be opened in a text editor such as Notepad.
Figure 103: The metrics icon opens module metrics and the option to download log files

Issue: Pipeline module fails with error "`[.data.frame`(obs, , tiledb_stored_names): undefined columns selected"
Possible Cause: Pipeline run name begins with a number instead of a letter
Suggested Actions: Rename the pipeline run starting with a letter instead of a number. See instructions on page 37.

Issue: Novae module fails
Possible Cause: Insufficient threads or CPU cores; hub traffic
Suggested Actions: The module may succeed on a retry. Try breaking up the study into smaller units for analysis. Contact support.spatial@bruker.com if the problem persists.
Table 9: CosMx SMI Data Analysis troubleshooting
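Two rows of the table above lend themselves to a quick script: checking that a proposed pipeline run name starts with a letter, and scanning a downloaded log archive for likely error lines without unzipping it by hand. This is a minimal sketch, not a Bruker tool. The function names are hypothetical, the keywords "error" and "failed" are guesses about what the logs contain, and treating only ASCII letters as valid first characters is a conservative assumption.

```python
from __future__ import annotations

import zipfile


def run_name_starts_with_letter(name: str) -> bool:
    """True if the first character is an ASCII letter (conservative assumption)."""
    first = name[:1]
    return first.isascii() and first.isalpha()


def find_error_lines(
    zip_path: str, keywords: tuple[str, ...] = ("error", "failed")
) -> list[tuple[str, int, str]]:
    """Return (file name, line number, line text) for lines containing a keyword.

    Matching is case-insensitive. The keyword list is an assumption about
    the log contents and may need adjusting.
    """
    hits: list[tuple[str, int, str]] = []
    with zipfile.ZipFile(zip_path) as archive:
        for member in archive.namelist():
            if member.endswith("/"):
                continue  # skip folder entries inside the archive
            text = archive.read(member).decode("utf-8", errors="replace")
            for number, line in enumerate(text.splitlines(), start=1):
                if any(keyword in line.lower() for keyword in keywords):
                    hits.append((member, number, line.strip()))
    return hits
```

Reading the archive in place keeps the Downloads folder uncluttered. The matching lines can then be quoted when contacting support.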
[Page 103]
Issue: "Failed to rerun pipeline step [object Object]" message
Possible Cause: Module error
Suggested Actions: May occur if re-running a custom module that directly follows Initial Data. Please create and run a new pipeline instead of re-running the module.

Issue: Image export results in an error
Possible Cause: Processing issue
Suggested Actions: Wait 30 minutes and retry image export. If issue persists, contact support.spatial@bruker.com.

Issue: JPG image export fails with error "unable to write to target"
Possible Cause: Known issue related to JPG export
Suggested Actions: Uncheck the box to include a scalebar when selecting the objects to export. Retry image export.

Issue: Export fails with error "Pod terminated preemptively" (or "ExpiredToken" in logs)
Possible Cause: Export exceeded time limit of session
Suggested Actions: Break up the export into smaller jobs by reducing the number of flow cells in the study or selecting fewer files to export from the Export dataset dialog (Figure 58 on page 77).

Issue: Exporting Seurat object yields error "invalid class Seurat object" or "all cells in reductions must be in the same order as the Seurat object"
Possible Cause: Known issue related to modules PCA, UMAP, Nearest Neighbor
Suggested Actions: This issue will be addressed in an upcoming software release. In the meantime, please work around the issue by exporting the flat files and then using the Seurat package to create the Seurat object.

Issue: Difficulty with immune cell typing
Possible Cause: Immune cells may pose particular challenges in cell typing
Suggested Actions: Use HieraType, a hierarchical cell typing method optimized for detection and subclustering of immune cells (described in CosMx Scratch Space).

Issue: Image shows black circles or holes (Figure 104)
Possible Cause: Fiducials were not prepared or applied properly
Suggested Actions: Contact support.spatial@bruker.com to further troubleshoot.
Figure 104: Appearance of holes on tissue
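For the "Pod terminated preemptively" row, the advice to break an export into smaller jobs can be planned ahead of time. The sketch below is illustrative only. It assumes you know the approximate size of each flow cell's data and have measured a typical transfer rate, neither of which this manual provides. The 12-hour figure is the session-token limit mentioned in Appendix III. The 50% safety factor and all function names are hypothetical.

```python
from __future__ import annotations


def max_gb_for_session(
    throughput_mb_s: float, limit_hours: float = 12, safety: float = 0.5
) -> float:
    """Largest export (in GB) expected to finish within the session limit.

    throughput_mb_s is a rate you measured yourself (an assumption here).
    safety scales the result down to leave a margin for slowdowns.
    """
    return throughput_mb_s * 3600 * limit_hours * safety / 1000


def plan_export_batches(sizes_gb: dict[str, float], max_gb_per_job: float) -> list[list[str]]:
    """Group items, largest first, so each batch stays within max_gb_per_job.

    An item that alone exceeds the limit still gets its own batch, which
    signals that its files need to be split further in the export dialog.
    """
    if max_gb_per_job <= 0:
        raise ValueError("max_gb_per_job must be positive")
    batches: list[list[str]] = []
    current: list[str] = []
    current_gb = 0.0
    for name, size in sorted(sizes_gb.items(), key=lambda item: item[1], reverse=True):
        if current and current_gb + size > max_gb_per_job:
            batches.append(current)
            current, current_gb = [], 0.0
        current.append(name)
        current_gb += size
    if current:
        batches.append(current)
    return batches
```

With three flow cells of 300, 200, and 150 GB and a 400 GB budget per job, the plan is two exports: the 300 GB flow cell alone, then the other two together. The grouping is a simple greedy pass, so it is not guaranteed to use the fewest possible jobs.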
[Page 104]
Bruker Spatial Biology, Inc.
3350 Monte Villa Parkway
Bothell, WA 98021
brukerspatialbiology.com
Contact Us
Tel: +1 888.358.6266
Fax: +1 206.378.6288
Technical Support: support.spatial@bruker.com
Customer Service: customerservice.spatial@bruker.com
Sales Contacts
North America: nasales.bsb@bruker.com
EMEA: emeasales.bsb@bruker.com
APAC: apacsales.bsb@bruker.com
All other regions: globalsales.bsb@bruker.com
FOR RESEARCH USE ONLY. Not for use in diagnostic procedures.