[aroma.affymetrix] GcRmaBackgroundCorrection

Krzysztof
Hello,

I know this has been addressed before, but I am still not clear about it and am not sure whether it was solved. I am having trouble getting gcRMA to work with a Human Transcriptome Array 2.0. I created custom CDFs based on probes mapped to 5' UTRs only. Please find my R code, the output/error, and sessionInfo below.
Could you help me with the following questions? I would really appreciate it.

1. Regarding the probe_tab file required for GcRmaBackgroundCorrection, does it have to have specific column names? When I change the first column name from "Probe Set Name" to "Probe SetName", I get a completely different error (somewhat reminiscent of an error I read about in another post on this group from 2010 regarding GcRmaBackgroundCorrection):

Error: Either argument 'names' or 'pattern' must be specified.
In addition: Warning message:
In readDataFrame.TabularTextFile(ptf, colClasses = c(`^(unitName|probeSetID)$` = "character"),  :
  Argument 'rows' was out of range [1,0]. Ignored rows beyond this range.

2. As I understand from your previous posts, the solution would be to download the Affymetrix CDF for background correction purposes. Could you explain why this is needed and why it can't be done using a custom CDF? Also, if I followed that approach, how would it affect the analysis when only a small subset of probes is used?
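
Regarding question 1, one way to rule out header problems is to print the raw column names of the probe-tab file exactly as they are stored. The following is only a minimal base-R sketch: the file content is a hypothetical stand-in (the data row reuses the values shown in the log further down), the header names other than "Probe Set Name" are made up for illustration, and treating "Probe Set Name" as the expected first column is an assumption based on the behaviour described above, not on the package documentation.

```r
# Hypothetical stand-in for the first lines of a probe-tab file.
# Only "Probe Set Name" comes from the post; the other header names are assumed.
lines <- c(
  "Probe Set Name\tProbe X\tProbe Y\tProbe Sequence",
  "NM_000015\t2493\t329\tCTTCCCTTGCAGACTTTGGAAGGGA"
)
path <- tempfile(fileext = ".txt")
writeLines(lines, path)

# Read only the header line and split it on tabs.
header <- strsplit(readLines(path, n = 1), "\t", fixed = TRUE)[[1]]

# Bracket each name so that stray or missing spaces become visible.
print(sprintf("[%s]", header))

# Assumption: the first column is expected to be named "Probe Set Name".
has_expected_first_column <- identical(header[1], "Probe Set Name")
print(has_expected_first_column)
```

With a real file, `path` would point at the probe-tab file under annotationData/chipTypes/ instead of a temporary file.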

Thank you

Best wishes
Krzysztof


This is my code:
library(aroma.affymetrix)

# chipType, name, design and verbose are assumed to be defined earlier in the session (not shown)

# Load the custom CDF and all CEL files of the data set
cdf <- AffymetrixCdfFile$byChipType(chipType, tags='binary')
cs <- AffymetrixCelSet$byName(name, cdf=cdf, verbose=verbose)

# Keep only the CEL files listed in the experimental design;
# the folder contains more CEL files than are currently used
mt <- match(design[,1], getNames(cs))
ds <- extract(cs, mt)
setCdf(ds, cdf)

# Background correction
bc <- GcRmaBackgroundCorrection(ds, type="affinities")
csBC <- process(bc, verbose=-20)

# Here is the output:
Background correcting data set...
 Background correcting data set...
  Already background corrected for "optical" effects
 Background correcting data set...done
 Computing probe affinities (independent of data)...
  Computing GCRMA probe affinities...
   Chip type: HTA2_hg19_refseq_fiveutr
   Number of units: 33640
   Locating the cell sequence annotation data file...
   Locating the cell sequence annotation data file...done
  Computing GCRMA probe affinities...done
  Computing GCRMA probe affinities...
   Number of units: 33640
   Identify PMs and MMs among the CDF cell indices...
     logi [1:822842] TRUE TRUE TRUE TRUE TRUE TRUE ...
       Mode    TRUE    NA's 
    logical  822842       0 
    MMs are defined as non-PMs
    Number of PMs: 822842
    Number of MMs: 0
   Identify PMs and MMs among the CDF cell indices...done
   Reading probe-sequence data...
    Retrieving probe-sequence data...
     Chip type (full): HTA2_hg19_refseq_fiveutr,binary
     Locating probe-tab file...
      Chip type: HTA2_hg19_refseq_fiveutr
      AffymetrixProbeTabFile:
      Name: HTA2_hg19_refseq_fiveutr
      Tags: 
      Full name: HTA2_hg19_refseq_fiveutr
      Pathname: annotationData/chipTypes/HTA2_hg19_refseq_fiveutr/HTA2_hg19_refseq_fiveutr_probe_tab
      File size: 43.03 MiB (45125121 bytes)
      RAM: 0.01 MB
      Number of data rows: NA
      Columns [6]: 'unitName', 'probeXPos', 'probeYPos', 'interrogationPosition', 'probeSequence', 'targetStrandedness'
      Number of text lines: NA
      AffymetrixCdfFile:
      Path: annotationData/chipTypes/HTA2_hg19_refseq_fiveutr
      Filename: HTA2_hg19_refseq_fiveutr.cdf
      File size: 16.44 MiB (17238612 bytes)
      Chip type: HTA2_hg19_refseq_fiveutr
      RAM: 0.00MB
      File format: v4 (binary; XDA)
      Dimension: 2572x2680
      Number of cells: 6892960
      Number of units: 33640
      Cells per unit: 204.90
      Number of QC units: 0
     Locating probe-tab file...done
     Validating probe-tab file against CDF...
      Number of records read: 1
      Data read:
      'data.frame': 1 obs. of  1 variable:
       $ unitName: chr "NM_000015"
      Unit name:
       chr "NM_000015"
      Unit index: 2
        probeXPos probeYPos             probeSequence
      1      2493       329 CTTCCCTTGCAGACTTTGGAAGGGA
      (x,y):
      [1] 2493  329
     Validating probe-tab file against CDF...done
     Reading (x,y,sequence) data...
     Reading (x,y,sequence) data...done
     Validating (x,y) against CDF dimension...
      CDF dimension:
         nbrOfRows nbrOfColumns 
              2572         2680 
[2016-03-31 10:22:42] Exception: Detected probe x position out of range [0,2572]: annotationData/chipTypes/HTA2_hg19_refseq_fiveutr/HTA2_hg19_refseq_fiveutr_probe_tab

  at #08. getProbeSequenceData.AffymetrixCdfFile(this, safe = safe, verbose = verbose)
          - getProbeSequenceData.AffymetrixCdfFile() is in environment 'aroma.affymetrix'

  at #07. getProbeSequenceData(this, safe = safe, verbose = verbose)
          - getProbeSequenceData() is in environment 'aroma.affymetrix'

  at #06. computeAffinities.AffymetrixCdfFile(cdf, ..., verbose = less(verbose))
          - computeAffinities.AffymetrixCdfFile() is in environment 'aroma.affymetrix'

  at #05. computeAffinities(cdf, ..., verbose = less(verbose))
          - computeAffinities() is in environment 'aroma.affymetrix'

  at #04. calculateAffinities.GcRmaBackgroundCorrection(this, verbose = less(verbose))
          - calculateAffinities.GcRmaBackgroundCorrection() is in environment 'aroma.affymetrix'

  at #03. calculateAffinities(this, verbose = less(verbose))
          - calculateAffinities() is in environment 'aroma.affymetrix'

  at #02. process.GcRmaBackgroundCorrection(bc, verbose = -20)
          - process.GcRmaBackgroundCorrection() is in environment 'aroma.affymetrix'

  at #01. process(bc, verbose = -20)
          - process() is in environment 'aroma.core'

Error: Detected probe x position out of range [0,2572]: annotationData/chipTypes/HTA2_hg19_refseq_fiveutr/HTA2_hg19_refseq_fiveutr_probe_tab
     Validating (x,y) against CDF dimension...done
    Retrieving probe-sequence data...done
   Reading probe-sequence data...done
  Computing GCRMA probe affinities...done
 Computing probe affinities (independent of data)...done
Background correcting data set...done
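
The exception above reports an x position outside [0,2572], while the log prints a CDF dimension of 2572 rows by 2680 columns. To see which probes would trip such a check, the probe coordinates can be compared against both bounds. This is a minimal base-R sketch, not the package's own validation: the probe table is made up (only the first row and the column names come from the log), and it assumes that x indexes columns and y indexes rows, both zero-based.

```r
# Made-up probe table using the column names reported in the log.
# Only the first row (NM_000015, x = 2493, y = 329) comes from the log.
probes <- data.frame(
  unitName  = c("NM_000015", "unitA", "unitB"),
  probeXPos = c(2493L, 2600L, 2680L),
  probeYPos = c(329L, 100L, 2571L),
  stringsAsFactors = FALSE
)

# CDF dimension as printed in the log.
nbrOfRows <- 2572L
nbrOfColumns <- 2680L

# Assumption: x indexes columns and y indexes rows, both zero-based.
x_ok <- probes$probeXPos >= 0L & probes$probeXPos < nbrOfColumns
y_ok <- probes$probeYPos >= 0L & probes$probeYPos < nbrOfRows

# For comparison: x tested against the upper bound quoted in the error message.
x_gt_2572 <- probes$probeXPos > 2572L

print(range(probes$probeXPos))
print(probes[!(x_ok & y_ok), ])  # probes outside the CDF grid
print(probes[x_gt_2572, ])       # probes beyond the bound in the error message
```

If many probes fall between 2572 and 2679 in x, they lie inside the grid under the assumption above but would still exceed the bound quoted in the error.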

#Session info
R version 3.2.3 (2015-12-10)
Platform: x86_64-apple-darwin13.4.0 (64-bit)
Running under: OS X 10.11.1 (El Capitan)

locale:
[1] en_GB.UTF-8/en_GB.UTF-8/en_GB.UTF-8/C/en_GB.UTF-8/en_GB.UTF-8

attached base packages:
[1] stats4    parallel  stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] GO.db_3.2.2            org.Hs.eg.db_3.2.3     RSQLite_1.0.0          DBI_0.3.1              AnnotationDbi_1.32.3  
 [6] Biobase_2.30.0         Sushi_1.8.0            moments_0.14           entropy_1.2.1          zoo_1.7-12            
[11] biomaRt_2.26.1         reshape2_1.4.1         rtracklayer_1.30.3     GenomicRanges_1.22.4   GenomeInfoDb_1.6.3    
[16] IRanges_2.4.8          S4Vectors_0.8.11       BiocGenerics_0.16.1    FIRMAGene_0.9.8        dplyr_0.4.3           
[21] stringr_1.0.0          data.table_1.9.6       aroma.light_3.0.0      aroma.affymetrix_3.0.0 aroma.core_3.0.0      
[26] R.devices_2.14.0       R.filesets_2.10.0      R.utils_2.2.0          R.oo_1.20.0            affxparser_1.42.0     
[31] R.methodsS3_1.7.1     

loaded via a namespace (and not attached):
 [1] Rcpp_0.12.3                lattice_0.20-33            listenv_0.6.0              Rsamtools_1.22.0          
 [5] Biostrings_2.38.4          assertthat_0.1             digest_0.6.9               R6_2.1.2                  
 [9] plyr_1.8.3                 chron_2.3-47               futile.options_1.0.0       R.huge_0.9.0              
[13] BiocInstaller_1.20.1       zlibbioc_1.16.0            preprocessCore_1.32.0      splines_3.2.3             
[17] BiocParallel_1.4.3         gcrma_2.42.0               RCurl_1.95-4.8             base64enc_0.1-3           
[21] aroma.apd_0.6.0            R.rsp_0.21.0               globals_0.6.1              SummarizedExperiment_1.0.2
[25] DNAcopy_1.44.0             codetools_0.2-14           matrixStats_0.50.1         XML_3.98-1.4              
[29] future_0.12.0              GenomicAlignments_1.6.3    bitops_1.0-6               grid_3.2.3                
[33] affy_1.48.0                magrittr_1.5               stringi_1.0-1              XVector_0.10.0            
[37] affyio_1.40.0              PSCBS_0.61.0               futile.logger_1.4.1        lambda.r_1.1.7            
[41] tools_3.2.3                R.cache_0.12.0   
