Final analysis of Primnoa coral samples, September/October 2015, showing specific filenames
Running on the Amazon EC2 server, using QIIME 1.9

System information
==================
Platform: linux2
Python version: 2.7.3 (default, Aug 1 2012, 05:14:39) [GCC 4.6.3]
Python executable: /usr/bin/python

QIIME default reference information
===================================
For details on what files are used as QIIME's default references, see here:
https://github.com/biocore/qiime-default-reference/releases/tag/0.1.2

Dependency versions
===================
QIIME library version: 1.9.1
QIIME script version: 1.9.1
qiime-default-reference version: 0.1.2
NumPy version: 1.9.2
SciPy version: 0.15.1
pandas version: 0.16.1
matplotlib version: 1.4.3
biom-format version: 2.1.4
h5py version: 2.5.0 (HDF5 version: 1.8.4)
qcli version: 0.1.1
pyqi version: 0.3.2
scikit-bio version: 0.2.3
PyNAST version: 1.2.2
Emperor version: 0.9.51
burrito version: 0.9.1
burrito-fillings version: 0.1.1
sortmerna version: SortMeRNA version 2.0, 29/11/2014
sumaclust version: SUMACLUST Version 1.0.00
swarm version: Swarm 1.2.19 [May 26 2015 15:28:37]
gdata: Installed.

> process_sff.py -i SFF/ -f -o Output_dir
#Files from 454 runs 1 (prim1), containing 7 Primnoa pacifica from Alaska and 10 P. resedaeformis from Baltimore Canyon (plus other corals),
#and 2 (3008), containing 10 P. resedaeformis from Norfolk Canyon (plus other corals)
#H8MF54001.fna H8MF54001.txt H8MF54002.qual ID2UZ7K01.fna ID2UZ7K01.txt ID2UZ7K02.qual
#H8MF54001.qual H8MF54002.fna H8MF54002.txt ID2UZ7K01.qual ID2UZ7K02.fna ID2UZ7K02.txt

> split_libraries.py -m Primnoa3008_map.txt -f Output_dir/ID2UZ7K01.fna,Output_dir/ID2UZ7K02.fna -q Output_dir/ID2UZ7K01.qual,Output_dir/ID2UZ7K02.qual -w 50 -g -r -l 200 -L 700 -M 1 -b 10 -n 1000000 -o split_libraries_output_3008
#Number raw input seqs 964771
#Length outside bounds of 200 and 700 48261
#Num ambiguous bases exceeds limit of 6 160
#Missing Qual Score 0
#Mean qual score below minimum of 25 14582
#Max homopolymer run exceeds limit of 6 45454
#Num mismatches in primer exceeds limit of 1: 38163
#Size of quality score window, in base pairs: 50
#Number of sequences where a low quality score window was detected: 5577
#Sequences with a low quality score were not written, -g option enabled.
#Sequence length details for all sequences passing quality filters:
#Raw len min/max/avg 200.0/488.0/389.2
#Wrote len min/max/avg 175.0/463.0/364.2
#Barcodes corrected/not 224/768195
#Uncorrected barcodes will not be written to the output fasta file.
#Corrected barcodes will be written with the appropriate barcode category.
#Corrected but unassigned sequences will not be written unless --retain_unassigned_reads is enabled.
#Total valid barcodes that are not in mapping file 0
#Sequences associated with valid barcodes that are not in the mapping file will not be written.
#Barcodes in mapping file
#Num Samples 10
#Sample ct min/max/mean: 1175 / 10162 / 4437.90
#Sample Sequence Count Barcode
#NF20Q1 10162 AGACGCACTC
#RB684Q4 6996 CGTGTCTCTA
#RB684Q5 6207 CTCGCGTGTC
#NF12Q7 4525 ACGCTCGACA
#RB684Q1 4198 AGCACTGTAG
#RB684Q3 3954 ATATCGCGAG
#RB687Q2 3450 TGATACGTCT
#RB686Q3 2123 TCTCTATGCG
#NF12Q6 1589 ACGAGTGCGT
#RB684Q2 1175 ATCAGACACG
#Total number seqs written 44379

> split_libraries.py -m Primnoaprim1_map.txt -f Output_dir/H8MF54001.fna,Output_dir/H8MF54002.fna -q Output_dir/H8MF54001.qual,Output_dir/H8MF54002.qual -w 50 -g -r -l 200 -L 700 -M 1 -b 10 -n 2000000 -o split_libraries_output_prim1
#Number raw input seqs 552695
#Length outside bounds of 200 and 700 57592
#Num ambiguous bases exceeds limit of 6 214
#Missing Qual Score 0
#Mean qual score below minimum of 25 3264
#Max homopolymer run exceeds limit of 6 18919
#Num mismatches in primer exceeds limit of 1: 12246
#Size of quality score window, in base pairs: 50
#Number of sequences where a low quality score window was detected: 72677
#Sequences with a low quality score were not written, -g option enabled.
#Sequence length details for all sequences passing quality filters:
#Raw len min/max/avg 200.0/537.0/372.4
#Wrote len min/max/avg 175.0/512.0/347.4
#Barcodes corrected/not 1297/22274
#Uncorrected barcodes will not be written to the output fasta file.
#Corrected barcodes will be written with the appropriate barcode category.
#Corrected but unassigned sequences will not be written unless --retain_unassigned_reads is enabled.
#Total valid barcodes that are not in mapping file 0
#Sequences associated with valid barcodes that are not in the mapping file will not be written.
#Barcodes in mapping file
#Num Samples 16
#Sample ct min/max/mean: 2905 / 44630 / 22844.31
#Sample Sequence Count Barcode
#AKPP2 44630 CATAGTAGTG
#AKPP1 43859 TGATACGTCT
#AK342 38575 CTCGCGTGTC
#NF1Q6 36012 CGTCTAGTAC
#AK325 27011 CGTGTCTCTA
#NF5Q6 25063 ATCAGACACG
#AKUT1 24174 TCTCTATGCG
#AKPP4 22349 ATACGACGTA
#NF9Q6 18599 AGCACTGTAG
#NF2Q6 17607 TCTACGTAGC
#NF6Q7 16977 TGTACTACTC
#NF5Q7 15111 ATATCGCGAG
#NF6Q6 15027 TCACGTACTA
#NF9Q7 8841 ACGAGTGCGT
#NF10Q6 8769 ACGCTCGACA
#AKPP3 2905 CGAGAGATAC
#NF10Q7 0 AGACGCACTC
#Total number seqs written 365509

> cat split_libraries_output_3008/seqs.fna split_libraries_output_prim1/seqs.fna > combined.fna
#Concatenates the two fasta files from the separate plates into a single file for subsequent processing

> mkdir Denoised_preprocess
#For some reason, the next script throws an error ("Creating temporary directory failed") if you don't create the output directory first

> denoiser_preprocess.py -i Output_dir/ID2UZ7K01.txt,Output_dir/ID2UZ7K02.txt,Output_dir/H8MF54001.txt,Output_dir/H8MF54002.txt -f combined.fna -p AYTGGGYDTAAAGNG -o Denoised_preprocess
#Runs first clustering phase which groups reads based on common prefixes
#Removes primer sequence (-p) specified from reads before running phase 1

> denoiser.py -i Output_dir/ID2UZ7K01.txt,Output_dir/ID2UZ7K02.txt,Output_dir/H8MF54001.txt,Output_dir/H8MF54002.txt -f combined.fna -p Denoised_preprocess -o Denoised_final --titanium
#Removes sequencing noise characteristic of pyrosequencing by flowgram clustering
# -p flag tells it not to do Phase I preprocess, instead use output from denoiser_preprocess.py
# --titanium tells it to use titanium error profile
# if it throws a memory error, can try again using flag --low_memory

> inflate_denoiser_output.py -c Denoised_final/centroids.fasta -s Denoised_final/singletons.fasta -f combined.fna -d Denoised_final/denoiser_mapping.txt -o inflated_seqs.fna
#Inflates denoiser results so they can be passed directly to OTU picker
#NOT ABUNDANCE SORTED, but confirmed w/ developers ok to use usearch61 instead of uclust
#409886 : inflated_seqs.fna (Sequence lengths (mean +/- std): 359.2973 +/- 19.1991)

> truncate_reverse_primer.py -f inflated_seqs.fna -m Primnoa_map.txt -o reverse_primer_removed/
#Removes reverse primer sequences, if present. Leaving them can interfere with otu picking.
#Original fasta filepath: inflated_seqs.fna
#Total seqs in fasta: 409886
#Mapping filepath: Primnoa_map.txt
#Truncation option: truncate_only
#Mismatches allowed: 2
#Total seqs written: 409886
#SampleIDs not found: 0
#Reverse primers not found: 18212

> split_sequence_file_on_sample_ids.py -i reverse_primer_removed/inflated_seqs_rev_primer_truncated.fna -o Out_countseqs/
#Splitting into individual sample files so we can count the number of sequences in each and decide if any poor runs need to be removed

> cd Out_countseqs/
> count_seqs.py -i "*.fasta"
#Prints to screen the number of sequences in each sample file, so we can ID poor runs for removal before rarefaction
#1175 : RB684Q2.fasta (Sequence lengths (mean +/- std): 349.5566 +/- 32.3843)
#1589 : NF12Q6.fasta (Sequence lengths (mean +/- std): 289.3820 +/- 65.7564)
#2123 : RB686Q3.fasta (Sequence lengths (mean +/- std): 333.0372 +/- 14.3262)
#2905 : AKPP3.fasta (Sequence lengths (mean +/- std): 329.3198 +/- 10.8618)
#3450 : RB687Q2.fasta (Sequence lengths (mean +/- std): 345.8954 +/- 29.8241)
#3954 : RB684Q3.fasta (Sequence lengths (mean +/- std): 340.7726 +/- 25.3149)
#4198 : RB684Q1.fasta (Sequence lengths (mean +/- std): 329.5617 +/- 40.4511)
#4525 : NF12Q7.fasta (Sequence lengths (mean +/- std): 331.1876 +/- 12.7309)
#6207 : RB684Q5.fasta (Sequence lengths (mean +/- std): 351.9515 +/- 23.1221)
#6996 : RB684Q4.fasta (Sequence lengths (mean +/- std): 344.3622 +/- 25.7358)
#8769 : NF10Q6.fasta (Sequence lengths (mean +/- std): 332.1034 +/- 11.8722)
#8841 : NF9Q7.fasta (Sequence lengths (mean +/- std): 332.4022 +/- 10.0840)
#10162 : NF20Q1.fasta (Sequence lengths (mean +/- std): 325.9087 +/- 23.5227)
#15027 : NF6Q6.fasta (Sequence lengths (mean +/- std): 308.4805 +/- 51.3313)
#15111 : NF5Q7.fasta (Sequence lengths (mean +/- std): 331.0245 +/- 11.5891)
#16977 : NF6Q7.fasta (Sequence lengths (mean +/- std): 329.1873 +/- 14.6630)
#17606 : NF2Q6.fasta (Sequence lengths (mean +/- std): 330.6885 +/- 8.8242)
#18599 : NF9Q6.fasta (Sequence lengths (mean +/- std): 331.2556 +/- 12.3142)
#22349 : AKPP4.fasta (Sequence lengths (mean +/- std): 331.0313 +/- 5.8035)
#24174 : AKUT1.fasta (Sequence lengths (mean +/- std): 330.7022 +/- 8.4301)
#25063 : NF5Q6.fasta (Sequence lengths (mean +/- std): 325.2797 +/- 21.9882)
#27011 : AK325.fasta (Sequence lengths (mean +/- std): 325.8296 +/- 20.2151)
#36012 : NF1Q6.fasta (Sequence lengths (mean +/- std): 327.7966 +/- 9.3134)
#38574 : AK342.fasta (Sequence lengths (mean +/- std): 330.8569 +/- 5.9764)
#43859 : AKPP1.fasta (Sequence lengths (mean +/- std): 332.2088 +/- 4.3634)
#44630 : AKPP2.fasta (Sequence lengths (mean +/- std): 331.8067 +/- 2.5637)
#409886 : Total

> extract_seqs_by_sample_id.py -i reverse_primer_removed/inflated_seqs_rev_primer_truncated.fna -o outseqs.fasta -s RB684Q2,NF12Q6,RB686Q3 -n
#Creates an fna file containing all samples' sequences EXCEPT the three smallest runs that we've listed; those should be removed.
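#A minimal sketch (not part of QIIME) of how the comma-separated list given to -s could be built from the count_seqs.py output rather than typed by hand. The helper name samples_below and the 2,500 cutoff are assumptions made for illustration; with the counts above, any cutoff from 2,124 to 2,905 selects the same three runs.

```python
import re

# Matches lines such as "1175 : RB684Q2.fasta (Sequence lengths ...)".
# The "Total" line has no .fasta filename, so it is skipped.
COUNT_LINE = re.compile(r"^#?\s*(\d+)\s*:\s*(\S+)\.fasta\b")


def samples_below(lines, cutoff):
    """Return sample IDs with fewer than `cutoff` sequences, smallest first.

    Hypothetical helper: `lines` are text lines in the format printed by
    count_seqs.py for per-sample fasta files.
    """
    hits = []
    for line in lines:
        match = COUNT_LINE.match(line.strip())
        if match:
            count, sample = int(match.group(1)), match.group(2)
            if count < cutoff:
                hits.append((count, sample))
    return [sample for count, sample in sorted(hits)]


# A few lines copied from the count_seqs.py output in these notes.
example = [
    "1175 : RB684Q2.fasta (Sequence lengths (mean +/- std): 349.5566 +/- 32.3843)",
    "1589 : NF12Q6.fasta (Sequence lengths (mean +/- std): 289.3820 +/- 65.7564)",
    "2123 : RB686Q3.fasta (Sequence lengths (mean +/- std): 333.0372 +/- 14.3262)",
    "2905 : AKPP3.fasta (Sequence lengths (mean +/- std): 329.3198 +/- 10.8618)",
    "409886 : Total",
]

# 2500 is an illustrative cutoff, not a value stated in the notes.
to_drop = samples_below(example, 2500)
print(",".join(to_drop))
```

#The printed string can be pasted after -s; -n then keeps every sample except those listed.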
#Check results by resplitting and counting
#> split_sequence_file_on_sample_ids.py -i outseqs.fasta -o Out_countseqs2/
#> cd Out_countseqs2/
#> count_seqs.py -i "*.fasta"
#So now outseqs.fasta is the new combined file with all the samples we want to process further
#404999 : outseqs.fasta (Sequence lengths (mean +/- std): 329.9701 +/- 17.5445)

> extract_seqs_by_sample_id.py -i reverse_primer_removed/inflated_seqs_rev_primer_truncated.fna -o outseqs_10K.fasta -s RB684Q2,NF12Q6,RB686Q3,AKPP3,RB687Q2,RB684Q3,RB684Q1,NF12Q7,RB684Q5,RB684Q4,NF10Q6,NF9Q7 -n
#Creates an fna file containing all samples' sequences EXCEPT the runs that we've listed (those <10,000 seq); those should be removed.
#This removes all Norfolk Canyon samples from the data set but we may want to look at this later to get a deeper view of Baltimore vs. Pacific
#Check results by resplitting and counting
#> split_sequence_file_on_sample_ids.py -i outseqs_10K.fasta -o Out_countseqs3/
#> cd Out_countseqs3/
#> count_seqs.py -i "*.fasta"
#355154 : outseqs_10K.fasta (Sequence lengths (mean +/- std): 328.9089 +/- 16.3329)

> pick_open_reference_otus.py -i outseqs.fasta -m usearch61 -o usearch61_openref_Green/ -f
#Only way to implement new sub-sampled open-reference clustering; Rideout et al. (2014) https://peerj.com/articles/545/
#Runs pick_otus.py, pick_rep_set.py, align_seqs.py, assign_taxonomy.py, make_otu_table.py, make_phylogeny.py
#use usearch61 instead of uclust b/c better AND has chimera-checking incorporated
#Alignment is performed with PyNAST, taxonomy is assigned with uclust
#Singletons are removed from the OTU table as a default (--min_otu_size = 2)
#Default reference database is greengenes

> cd uclust_assigned_taxonomy
#2174 lines in rep_set_tax_assignments.txt by wc -l
#grep -c k__Bact = 1296 + grep Unassigned 878; 34 Chloroplasts; 5 mitochondria (need to remove)

> cd pynast_aligned_seqs
#1576 sequences in rep_set_failures.fasta; check the first 3 in BLAST to see if really bad
#All really were garbage (18S, mitochondria, short), so want to move forward using the otu_table_mc2_w_tax_no_pynast_failures.biom rather than otu_table_mc2_w_tax.biom

> cd usearch61_openref_Green
> filter_taxa_from_otu_table.py -i otu_table_mc2_w_tax_no_pynast_failures.biom -o otu_table_final.biom -n c__Chloroplast,f__mitochondria
#Removing Chloroplast and mitochondrial sequences.
# > biom convert -i file.biom -o file.txt --to-tsv to make input & output biom tables viewable as text, and use > wc -l file.txt to count lines
#Input 1388, output 1352 (confirmed removal of chloroplasts and mitochondria)

> biom summarize_table -i otu_table_final.biom -o summary_otu_table_final.txt
#Use 'more' to look at output table. Gives seq numbers so you can determine lowest sample number for rarefaction
#Num samples: 23
#Num observations: 1350
#Total count: 325694
#Table density (fraction of non-zero values): 0.161
#Counts/sample summary:
#Min: 689.0
#Max: 44282.0
#Median: 7546.000
#Mean: 14160.609
#Std. dev.: 13012.255
#Counts/sample detail:
#NF12Q7: 689.0
#NF20Q1: 1099.0
#AKPP3: 2328.0
#RB687Q2: 2557.0
#RB684Q1: 2561.0
#RB684Q3: 3176.0
#NF10Q6: 4261.0
#RB684Q4: 4753.0
#NF9Q7: 5024.0
#RB684Q5: 5820.0
#NF9Q6: 6773.0
#NF6Q6: 7546.0
#NF5Q7: 11425.0
#NF6Q7: 14653.0
#NF2Q6: 17031.0
#AKPP4: 22140.0
#AK325: 22242.0
#NF1Q6: 22448.0
#AKUT1: 23068.0
#NF5Q6: 23285.0
#AK342: 35370.0
#AKPP1: 43163.0
#AKPP2: 44282.0

> filter_samples_from_otu_table.py -i otu_table_final.biom -o filtered_otu_table.biom --sample_id_fp ids.txt --negate_sample_id_fp
#Created file ids.txt listing the three samples with counts below 2,500: NF12Q7, NF20Q1, AKPP3
#Script discards samples where the id is listed in ids.txt

> biom summarize_table -i filtered_otu_table.biom -o summary_otu_table_filtered_final.txt
#Use 'more' to look at output table. Gives seq numbers so you can determine lowest sample number for rarefaction
#Num samples: 20
#Num observations: 1350
#Total count: 321578
#Table density (fraction of non-zero values): 0.174
#Counts/sample summary:
#Min: 2557.0
#Max: 44282.0
#Median: 13039.000
#Mean: 16078.900
#Std. dev.: 12900.843
#Sample Metadata Categories: None provided
#Observation Metadata Categories: taxonomy
#Counts/sample detail:
#RB687Q2: 2557.0
#RB684Q1: 2561.0
#RB684Q3: 3176.0
#NF10Q6: 4261.0
#RB684Q4: 4753.0
#NF9Q7: 5024.0
#RB684Q5: 5820.0
#NF9Q6: 6773.0
#NF6Q6: 7546.0
#NF5Q7: 11425.0
#NF6Q7: 14653.0
#NF2Q6: 17031.0
#AKPP4: 22140.0
#AK325: 22242.0
#NF1Q6: 22448.0
#AKUT1: 23068.0
#NF5Q6: 23285.0
#AK342: 35370.0
#AKPP1: 43163.0
#AKPP2: 44282.0

> single_rarefaction.py -i filtered_otu_table.biom -o otu_table_final_rarefied2557.biom -d 2557
#Rarefaction step to compare all samples to the depth of 2557 sequences

> biom summarize_table -i otu_table_final_rarefied2557.biom -o summary_otu_table_rarefied2557.txt
#Examine output to confirm all samples are now 2557 each

> mkdir Alpha_Diversity_rare2557/
#Need output directory for next steps

> alpha_diversity.py -i otu_table_final_rarefied2557.biom -m observed_species,observed_otus,ace,chao1,simpson_reciprocal,shannon,simpson_e -o Alpha_Diversity_rare2557/Alpha_Diversity_rare2557.txt -t rep_set.tre
#more Alpha_Diversity_rare2557/Alpha_Diversity_rare2557.txt
#	observed_species	observed_otus	ace	chao1	simpson_reciprocal	shannon	simpson_e
#NF2Q6	314.0	314.0	448.65186867	414.563380282	12.625880331	5.6756240279	0.0402098099713
#AKPP2	53.0	53.0	101.291870152	86.8333333333	1.16715200942	0.709390509739	0.0220217360268
#NF9Q6	235.0	235.0	304.822207824	300.282608696	10.0850502305	5.25340748334	0.0429151073639
#NF1Q6	84.0	84.0	132.832402235	115.5	1.87549219756	1.92447929312	0.0223272880661
#RB684Q3	273.0	273.0	313.926793323	318.023809524	14.6602441786	6.12407222775	0.0537005281268
#RB684Q4	183.0	183.0	238.735125297	232.875	9.89053808953	5.00662023379	0.0540466562269
#RB687Q2	159.0	159.0	227.778077558	246.142857143	11.3948266878	4.7776642857	0.0716655766526
#AKPP1	32.0	32.0	78.3586989183	84.5	1.06039169423	0.308480258689	0.0331372404446
#AKPP4	100.0	100.0	159.723796018	145.15	3.37841102918	2.80631386303	0.0337841102918
#NF6Q7	297.0	297.0	427.074861414	402.217391304	11.0151456697	5.21618214413	0.0370880325578
#NF5Q7	207.0	207.0	291.865402331	273.5	7.26080834217	4.58198250376	0.0350763688028
#NF6Q6	219.0	219.0	270.610375448	265.5	11.4044862682	5.20890660856	0.0520752797635
#RB684Q1	178.0	178.0	242.39223404	233.617647059	10.837888731	4.75024705853	0.060887015343
#RB684Q5	241.0	241.0	294.421417052	289.372093023	6.318339204	5.21123020897	0.0262171751204
#AKUT1	98.0	98.0	160.370966277	152.473684211	1.82174215779	1.95272476934	0.0185892056918
#AK325	141.0	141.0	179.604141543	176.357142857	8.60272150375	4.23738110628	0.0610122092464
#NF10Q6	168.0	168.0	234.504513853	233.807692308	19.552706263	5.2826859802	0.116385156328
#NF5Q6	66.0	66.0	91.5051015491	91.0	1.44381513952	1.35622410708	0.0218759869624
#NF9Q7	115.0	115.0	128.054991827	128.125	7.38816086096	4.37408121565	0.0642448770518
#AK342	47.0	47.0	72.4943950575	61.25	1.44981809223	1.22366775286	0.0308471934518

> summarize_taxa_through_plots.py -o Alpha_Diversity_rare2557/taxa_summary2557/ -i otu_table_final_rarefied2557.biom
#Generates everyone's favorite bar graphs of relative abundance at Phylum to Genus level

> alpha_rarefaction.py -i otu_table_final_rarefied2557.biom -o Rarefaction/ -t rep_set.tre -m Primnoa_map.txt -e 2557
#Generates rarefaction curves to see if sequencing was exhaustive enough

> beta_diversity.py -i otu_table_final_rarefied2557.biom -m unweighted_unifrac,weighted_unifrac,binary_sorensen_dice,bray_curtis -o compar_div_rare2557/ -t rep_set.tre
#To compare weighted/unweighted and phylogenetic/taxonomic metrics

> beta_significance.py -i otu_table_final_rarefied2557.biom -t rep_set.tre -s unweighted_unifrac -o unw_sig.txt
#Determining if samples are statistically significantly different from each other
#Default 100 monte carlo randomizations
#Unweighted unifrac (so abundance doesn't matter, just presence/absence of taxa)
#unweighted unifrac significance test
sample 1	sample 2	p value	p value (Bonferroni corrected)
AK325	AK342	0.22	1.0
AK325	AKPP1	0.73	1.0
AK325	AKPP2	0.07	1.0
AK325	AKPP4	0.0	<=1.0e-02
AK325	AKUT1	0.11	1.0
AK325	NF10Q6	0.01	1.0
AK325	NF1Q6	0.21	1.0
AK325	NF2Q6	0.0	<=1.0e-02
AK325	NF5Q6	0.03	1.0
AK325	NF5Q7	0.04	1.0
AK325	NF6Q6	0.0	<=1.0e-02
AK325	NF6Q7	0.0	<=1.0e-02
AK325	NF9Q6	0.0	<=1.0e-02
AK325	NF9Q7	0.0	<=1.0e-02
AK325	RB684Q1	0.01	1.0
AK325	RB684Q3	0.0	<=1.0e-02
AK325	RB684Q4	0.0	<=1.0e-02
AK325	RB684Q5	0.0	<=1.0e-02
AK325	RB687Q2	0.03	1.0
AK342	AKPP1	0.9	1.0
AK342	AKPP2	0.96	1.0
AK342	AKPP4	0.8	1.0
AK342	AKUT1	0.93	1.0
AK342	NF10Q6	0.16	1.0
AK342	NF1Q6	0.84	1.0
AK342	NF2Q6	0.01	1.0
AK342	NF5Q6	0.01	1.0
AK342	NF5Q7	0.08	1.0
AK342	NF6Q6	0.12	1.0
AK342	NF6Q7	0.01	1.0
AK342	NF9Q6	0.07	1.0
AK342	NF9Q7	0.06	1.0
AK342	RB684Q1	0.34	1.0
AK342	RB684Q3	0.07	1.0
AK342	RB684Q4	0.14	1.0
AK342	RB684Q5	0.03	1.0
AK342	RB687Q2	0.48	1.0
AKPP1	AKPP2	0.92	1.0
AKPP1	AKPP4	1.0	1.0
AKPP1	AKUT1	0.72	1.0
AKPP1	NF10Q6	0.48	1.0
AKPP1	NF1Q6	0.75	1.0
AKPP1	NF2Q6	0.42	1.0
AKPP1	NF5Q6	0.22	1.0
AKPP1	NF5Q7	0.48	1.0
AKPP1	NF6Q6	0.85	1.0
AKPP1	NF6Q7	0.49	1.0
AKPP1	NF9Q6	0.46	1.0
AKPP1	NF9Q7	0.41	1.0
AKPP1	RB684Q1	0.48	1.0
AKPP1	RB684Q3	0.59	1.0
AKPP1	RB684Q4	0.96	1.0
AKPP1	RB684Q5	0.68	1.0
AKPP1	RB687Q2	0.87	1.0
AKPP2	AKPP4	0.44	1.0
AKPP2	AKUT1	0.24	1.0
AKPP2	NF10Q6	0.19	1.0
AKPP2	NF1Q6	0.21	1.0
AKPP2	NF2Q6	0.03	1.0
AKPP2	NF5Q6	0.13	1.0
AKPP2	NF5Q7	0.03	1.0
AKPP2	NF6Q6	0.36	1.0
AKPP2	NF6Q7	0.0	<=1.0e-02
AKPP2	NF9Q6	0.03	1.0
AKPP2	NF9Q7	0.01	1.0
AKPP2	RB684Q1	0.19	1.0
AKPP2	RB684Q3	0.02	1.0
AKPP2	RB684Q4	0.51	1.0
AKPP2	RB684Q5	0.03	1.0
AKPP2	RB687Q2	0.14	1.0
AKPP4	AKUT1	0.11	1.0
AKPP4	NF10Q6	0.0	<=1.0e-02
AKPP4	NF1Q6	0.09	1.0
AKPP4	NF2Q6	0.0	<=1.0e-02
AKPP4	NF5Q6	0.02	1.0
AKPP4	NF5Q7	0.0	<=1.0e-02
AKPP4	NF6Q6	0.01	1.0
AKPP4	NF6Q7	0.01	1.0
AKPP4	NF9Q6	0.0	<=1.0e-02
AKPP4	NF9Q7	0.0	<=1.0e-02
AKPP4	RB684Q1	0.0	<=1.0e-02
AKPP4	RB684Q3	0.0	<=1.0e-02
AKPP4	RB684Q4	0.0	<=1.0e-02
AKPP4	RB684Q5	0.0	<=1.0e-02
AKPP4	RB687Q2	0.08	1.0
AKUT1	NF10Q6	0.0	<=1.0e-02
AKUT1	NF2Q6	0.01	1.0
AKUT1	NF5Q6	0.01	1.0
AKUT1	NF5Q7	0.02	1.0
AKUT1	NF6Q6	0.01	1.0
AKUT1	NF6Q7	0.0	<=1.0e-02
AKUT1	NF9Q6	0.0	<=1.0e-02
AKUT1	NF9Q7	0.0	<=1.0e-02
AKUT1	RB684Q1	0.0	<=1.0e-02
AKUT1	RB684Q3	0.0	<=1.0e-02
AKUT1	RB684Q4	0.0	<=1.0e-02
AKUT1	RB684Q5	0.0	<=1.0e-02
AKUT1	RB687Q2	0.05	1.0
NF10Q6	NF1Q6	0.11	1.0
NF10Q6	NF2Q6	0.1	1.0
NF10Q6	NF5Q6	0.09	1.0
NF10Q6	NF5Q7	0.01	1.0
NF10Q6	NF6Q6	0.0	<=1.0e-02
NF10Q6	NF6Q7	0.0	<=1.0e-02
NF10Q6	NF9Q6	0.01	1.0
NF10Q6	NF9Q7	0.0	<=1.0e-02
NF10Q6	RB684Q1	0.17	1.0
NF10Q6	RB684Q3	0.05	1.0
NF10Q6	RB684Q4	0.04	1.0
NF10Q6	RB684Q5	0.0	<=1.0e-02
NF10Q6	RB687Q2	0.18	1.0
NF1Q6	NF2Q6	0.11	1.0
NF1Q6	NF5Q6	0.03	1.0
NF1Q6	NF5Q7	0.41	1.0
NF1Q6	NF6Q6	0.0	<=1.0e-02
NF1Q6	NF6Q7	0.02	1.0
NF1Q6	NF9Q6	0.21	1.0
NF1Q6	NF9Q7	0.16	1.0
NF1Q6	RB684Q1	0.68	1.0
NF1Q6	RB684Q3	0.09	1.0
NF1Q6	RB684Q4	0.35	1.0
NF1Q6	RB684Q5	0.04	1.0
NF1Q6	RB687Q2	0.61	1.0
NF2Q6	NF5Q6	0.38	1.0
NF2Q6	NF5Q7	0.01	1.0
NF2Q6	NF6Q6	0.0	<=1.0e-02
NF2Q6	NF6Q7	0.0	<=1.0e-02
NF2Q6	NF9Q6	0.0	<=1.0e-02
NF2Q6	NF9Q7	0.0	<=1.0e-02
NF2Q6	RB684Q1	0.03	1.0
NF2Q6	RB684Q3	0.0	<=1.0e-02
NF2Q6	RB684Q4	0.01	1.0
NF2Q6	RB684Q5	0.0	<=1.0e-02
NF2Q6	RB687Q2	0.01	1.0
NF5Q6	NF5Q7	0.08	1.0
NF5Q6	NF6Q6	0.01	1.0
NF5Q6	NF6Q7	0.0	<=1.0e-02
NF5Q6	NF9Q6	0.04	1.0
NF5Q6	NF9Q7	0.01	1.0
NF5Q6	RB684Q1	0.27	1.0
NF5Q6	RB684Q3	0.17	1.0
NF5Q6	RB684Q4	0.05	1.0
NF5Q6	RB684Q5	0.01	1.0
NF5Q6	RB687Q2	0.05	1.0
NF5Q7	NF6Q6	0.0	<=1.0e-02
NF5Q7	NF6Q7	0.0	<=1.0e-02
NF5Q7	NF9Q6	0.05	1.0
NF5Q7	NF9Q7	0.0	<=1.0e-02
NF5Q7	RB684Q1	0.12	1.0
NF5Q7	RB684Q3	0.01	1.0
NF5Q7	RB684Q4	0.1	1.0
NF5Q7	RB684Q5	0.02	1.0
NF5Q7	RB687Q2	0.07	1.0
NF6Q6	NF6Q7	0.0	<=1.0e-02
NF6Q6	NF9Q6	0.0	<=1.0e-02
NF6Q6	NF9Q7	0.0	<=1.0e-02
NF6Q6	RB684Q1	0.07	1.0
NF6Q6	RB684Q3	0.0	<=1.0e-02
NF6Q6	RB684Q4	0.0	<=1.0e-02
NF6Q6	RB684Q5	0.0	<=1.0e-02
NF6Q6	RB687Q2	0.0	<=1.0e-02
NF6Q7	NF9Q6	0.0	<=1.0e-02
NF6Q7	NF9Q7	0.0	<=1.0e-02
NF6Q7	RB684Q1	0.0	<=1.0e-02
NF6Q7	RB684Q3	0.0	<=1.0e-02
NF6Q7	RB684Q4	0.0	<=1.0e-02
NF6Q7	RB684Q5	0.0	<=1.0e-02
NF6Q7	RB687Q2	0.0	<=1.0e-02
NF9Q6	NF9Q7	0.0	<=1.0e-02
NF9Q6	RB684Q1	0.37	1.0
NF9Q6	RB684Q3	0.02	1.0
NF9Q6	RB684Q4	0.26	1.0
NF9Q6	RB684Q5	0.04	1.0
NF9Q6	RB687Q2	0.21	1.0
NF9Q7	RB684Q1	0.02	1.0
NF9Q7	RB684Q3	0.0	<=1.0e-02
NF9Q7	RB684Q4	0.0	<=1.0e-02
NF9Q7	RB684Q5	0.0	<=1.0e-02
NF9Q7	RB687Q2	0.0	<=1.0e-02
RB684Q1	RB684Q3	0.06	1.0
RB684Q1	RB684Q4	0.03	1.0
RB684Q1	RB684Q5	0.0	<=1.0e-02
RB684Q1	RB687Q2	0.06	1.0
RB684Q3	RB684Q4	0.0	<=1.0e-02
RB684Q3	RB684Q5	0.14	1.0
RB684Q3	RB687Q2	0.04	1.0
RB684Q4	RB684Q5	0.02	1.0
RB684Q4	RB687Q2	0.06	1.0
RB684Q5	RB687Q2	0.01	1.0

> beta_significance.py -i otu_table_final_rarefied2557.biom -t rep_set.tre -s weighted_unifrac -o w_sig.txt
#Determining if samples are statistically significantly different from each other
#Default 100 monte carlo randomizations
#Weighted unifrac (so abundance matters)
#weighted unifrac significance test
sample 1	sample 2	p value	p value (Bonferroni corrected)
AK325	AK342	0.53	1.0
AK325	AKPP1	0.54	1.0
AK325	AKPP2	0.51	1.0
AK325	AKPP4	0.48	1.0
AK325	AKUT1	0.59	1.0
AK325	NF10Q6	0.26	1.0
AK325	NF1Q6	0.7	1.0
AK325	NF2Q6	0.29	1.0
AK325	NF5Q6	0.84	1.0
AK325	NF5Q7	0.48	1.0
AK325	NF6Q6	0.38	1.0
AK325	NF6Q7	0.34	1.0
AK325	NF9Q6	0.62	1.0
AK325	NF9Q7	0.79	1.0
AK325	RB684Q1	0.87	1.0
AK325	RB684Q3	0.26	1.0
AK325	RB684Q4	0.5	1.0
AK325	RB684Q5	0.28	1.0
AK325	RB687Q2	0.86	1.0
AK342	AKPP1	0.36	1.0
AK342	AKPP2	0.42	1.0
AK342	AKPP4	0.63	1.0
AK342	AKUT1	0.59	1.0
AK342	NF10Q6	0.4	1.0
AK342	NF1Q6	0.61	1.0
AK342	NF2Q6	0.62	1.0
AK342	NF5Q6	0.53	1.0
AK342	NF5Q7	0.55	1.0
AK342	NF6Q6	0.66	1.0
AK342	NF6Q7	0.45	1.0
AK342	NF9Q6	0.47	1.0
AK342	NF9Q7	0.48	1.0
AK342	RB684Q1	0.59	1.0
AK342	RB684Q3	0.4	1.0
AK342	RB684Q4	0.34	1.0
AK342	RB684Q5	0.38	1.0
AK342	RB687Q2	0.52	1.0
AKPP1	AKPP2	0.62	1.0
AKPP1	AKPP4	0.6	1.0
AKPP1	AKUT1	0.35	1.0
AKPP1	NF10Q6	0.32	1.0
AKPP1	NF1Q6	0.6	1.0
AKPP1	NF2Q6	0.52	1.0
AKPP1	NF5Q6	0.48	1.0
AKPP1	NF5Q7	0.55	1.0
AKPP1	NF6Q6	0.61	1.0
AKPP1	NF6Q7	0.48	1.0
AKPP1	NF9Q6	0.49	1.0
AKPP1	NF9Q7	0.48	1.0
AKPP1	RB684Q1	0.51	1.0
AKPP1	RB684Q3	0.45	1.0
AKPP1	RB684Q4	0.38	1.0
AKPP1	RB684Q5	0.45	1.0
AKPP1	RB687Q2	0.44	1.0
AKPP2	AKPP4	0.5	1.0
AKPP2	AKUT1	0.33	1.0
AKPP2	NF10Q6	0.38	1.0
AKPP2	NF1Q6	0.57	1.0
AKPP2	NF2Q6	0.54	1.0
AKPP2	NF5Q6	0.5	1.0
AKPP2	NF5Q7	0.58	1.0
AKPP2	NF6Q6	0.53	1.0
AKPP2	NF6Q7	0.48	1.0
AKPP2	NF9Q6	0.53	1.0
AKPP2	NF9Q7	0.47	1.0
AKPP2	RB684Q1	0.46	1.0
AKPP2	RB684Q3	0.36	1.0
AKPP2	RB684Q4	0.41	1.0
AKPP2	RB684Q5	0.35	1.0
AKPP2	RB687Q2	0.45	1.0
AKPP4	AKUT1	0.79	1.0
AKPP4	NF10Q6	0.4	1.0
AKPP4	NF1Q6	0.63	1.0
AKPP4	NF2Q6	0.48	1.0
AKPP4	NF5Q6	0.61	1.0
AKPP4	NF5Q7	0.35	1.0
AKPP4	NF6Q6	0.36	1.0
AKPP4	NF6Q7	0.41	1.0
AKPP4	NF9Q6	0.49	1.0
AKPP4	NF9Q7	0.63	1.0
AKPP4	RB684Q1	0.64	1.0
AKPP4	RB684Q3	0.35	1.0
AKPP4	RB684Q4	0.48	1.0
AKPP4	RB684Q5	0.48	1.0
AKPP4	RB687Q2	0.54	1.0
AKUT1	NF10Q6	0.47	1.0
AKUT1	NF1Q6	0.6	1.0
AKUT1	NF2Q6	0.45	1.0
AKUT1	NF5Q6	0.59	1.0
AKUT1	NF5Q7	0.58	1.0
AKUT1	NF6Q6	0.57	1.0
AKUT1	NF6Q7	0.53	1.0
AKUT1	NF9Q6	0.44	1.0
AKUT1	NF9Q7	0.45	1.0
AKUT1	RB684Q1	0.58	1.0
AKUT1	RB684Q3	0.53	1.0
AKUT1	RB684Q4	0.41	1.0
AKUT1	RB684Q5	0.36	1.0
AKUT1	RB687Q2	0.61	1.0
NF10Q6	NF1Q6	0.65	1.0
NF10Q6	NF2Q6	0.74	1.0
NF10Q6	NF5Q6	0.74	1.0
NF10Q6	NF5Q7	0.74	1.0
NF10Q6	NF6Q6	0.86	1.0
NF10Q6	NF6Q7	0.95	1.0
NF10Q6	NF9Q6	0.68	1.0
NF10Q6	NF9Q7	0.68	1.0
NF10Q6	RB684Q1	0.9	1.0
NF10Q6	RB684Q3	0.65	1.0
NF10Q6	RB684Q4	0.94	1.0
NF10Q6	RB684Q5	0.61	1.0
NF10Q6	RB687Q2	0.64	1.0
NF1Q6	NF2Q6	0.73	1.0
NF1Q6	NF5Q6	0.97	1.0
NF1Q6	NF5Q7	0.85	1.0
NF1Q6	NF6Q6	0.77	1.0
NF1Q6	NF6Q7	0.68	1.0
NF1Q6	NF9Q6	0.91	1.0
NF1Q6	NF9Q7	0.9	1.0
NF1Q6	RB684Q1	0.96	1.0
NF1Q6	RB684Q3	0.55	1.0
NF1Q6	RB684Q4	0.83	1.0
NF1Q6	RB684Q5	0.47	1.0
NF1Q6	RB687Q2	0.73	1.0
NF2Q6	NF5Q6	0.86	1.0
NF2Q6	NF5Q7	0.38	1.0
NF2Q6	NF6Q6	0.27	1.0
NF2Q6	NF6Q7	0.85	1.0
NF2Q6	NF9Q6	0.7	1.0
NF2Q6	NF9Q7	0.72	1.0
NF2Q6	RB684Q1	0.85	1.0
NF2Q6	RB684Q3	0.76	1.0
NF2Q6	RB684Q4	0.76	1.0
NF2Q6	RB684Q5	0.57	1.0
NF2Q6	RB687Q2	0.55	1.0
NF5Q6	NF5Q7	0.81	1.0
NF5Q6	NF6Q6	0.84	1.0
NF5Q6	NF6Q7	0.75	1.0
NF5Q6	NF9Q6	0.91	1.0
NF5Q6	NF9Q7	0.76	1.0
NF5Q6	RB684Q1	0.98	1.0
NF5Q6	RB684Q3	0.66	1.0
NF5Q6	RB684Q4	0.81	1.0
NF5Q6	RB684Q5	0.7	1.0
NF5Q6	RB687Q2	0.82	1.0
NF5Q7	NF6Q6	0.61	1.0
NF5Q7	NF6Q7	0.81	1.0
NF5Q7	NF9Q6	0.9	1.0
NF5Q7	NF9Q7	0.94	1.0
NF5Q7	RB684Q1	0.9	1.0
NF5Q7	RB684Q3	0.72	1.0
NF5Q7	RB684Q4	0.96	1.0
NF5Q7	RB684Q5	0.85	1.0
NF5Q7	RB687Q2	0.93	1.0
NF6Q6	NF6Q7	0.81	1.0
NF6Q6	NF9Q6	0.82	1.0
NF6Q6	NF9Q7	0.91	1.0
NF6Q6	RB684Q1	0.99	1.0
NF6Q6	RB684Q3	0.91	1.0
NF6Q6	RB684Q4	0.98	1.0
NF6Q6	RB684Q5	0.93	1.0
NF6Q6	RB687Q2	0.98	1.0
NF6Q7	NF9Q6	0.73	1.0
NF6Q7	NF9Q7	0.79	1.0
NF6Q7	RB684Q1	0.99	1.0
NF6Q7	RB684Q3	0.67	1.0
NF6Q7	RB684Q4	0.88	1.0
NF6Q7	RB684Q5	0.81	1.0
NF6Q7	RB687Q2	0.76	1.0
NF9Q6	NF9Q7	0.7	1.0
NF9Q6	RB684Q1	1.0	1.0
NF9Q6	RB684Q3	0.73	1.0
NF9Q6	RB684Q4	0.95	1.0
NF9Q6	RB684Q5	0.68	1.0
NF9Q6	RB687Q2	0.9	1.0
NF9Q7	RB684Q1	0.94	1.0
NF9Q7	RB684Q3	0.9	1.0
NF9Q7	RB684Q4	1.0	1.0
NF9Q7	RB684Q5	0.91	1.0
NF9Q7	RB687Q2	0.98	1.0
RB684Q1	RB684Q3	0.29	1.0
RB684Q1	RB684Q4	0.54	1.0
RB684Q1	RB684Q5	0.28	1.0
RB684Q1	RB687Q2	0.4	1.0
RB684Q3	RB684Q4	0.02	1.0
RB684Q3	RB684Q5	0.67	1.0
RB684Q3	RB687Q2	0.33	1.0
RB684Q4	RB684Q5	0.17	1.0
RB684Q4	RB687Q2	0.89	1.0
RB684Q5	RB687Q2	0.59	1.0

> principal_coordinates.py -i compar_div_rare2557/ -o compar_div_rare2557_PCoA/
#Conducts principal coordinate analysis on each of the beta diversity stats (e.g., unweighted unifrac)

> make_2d_plots.py -i compar_div_rare2557_PCoA/pcoa_weighted_unifrac_otu_table_final_rarefied2557.txt -m ../Primnoa_map.txt -o PCoA_2D_plot_WU/
#Making 2D PCoA plots based on weighted unifrac OTU table

> make_2d_plots.py -i compar_div_rare2557_PCoA/pcoa_unweighted_unifrac_otu_table_final_rarefied2557.txt -m ../Primnoa_map.txt -o PCoA_2D_plot_UU/
#Making 2D PCoA plots based on unweighted unifrac OTU table

> make_2d_plots.py -i compar_div_rare2557_PCoA/pcoa_binary_sorensen_dice_otu_table_final_rarefied2557.txt -m ../Primnoa_map.txt -o PCoA_2D_plot_BSD/
#Making 2D PCoA plots based on binary sorensen dice OTU table

> make_2d_plots.py -i compar_div_rare2557_PCoA/pcoa_bray_curtis_otu_table_final_rarefied2557.txt -m ../Primnoa_map.txt -o PCoA_2D_plot_BC/
#Making 2D PCoA plots based on bray curtis OTU table

> compute_core_microbiome.py -i otu_table_final_rarefied2557.biom -o otu_table_core
#Identifying the core microbiome (shared OTUs)

> summarize_taxa_through_plots.py -o Core_taxa_summary2557/ -i otu_table_core/core_table_100.biom
#Visualizing core as bar graphs

> make_otu_heatmap.py -i Core_taxa_summary2557/core_table_100_L5.biom -o heatmap_core_family.pdf
#Visualizing core through heatmap at Family level

__________________________________________________________________________________________
__________________________________________________________________________________________
Going back and doing separate analysis of only the samples with >10,000 sequences

> pick_open_reference_otus.py -i outseqs_10K.fasta -m usearch61 -o usearch61_openref_Green_10K -f

> cd uclust_assigned_taxonomy
#1929 lines in rep_set_tax_assignments.txt by wc -l
#grep -c k__Bact = 1124 + grep Unassigned 805; 31 Chloroplasts; 5 mitochondria (need to remove)

> cd pynast_aligned_seqs
#1478 sequences in rep_set_failures.fasta; check the first 3 in BLAST to see if really bad
#All really were garbage (18S, mitochondria, short), so want to move forward using the otu_table_mc2_w_tax_no_pynast_failures.biom rather than otu_table_mc2_w_tax.biom

> cd usearch61_openref_Green_10K
> filter_taxa_from_otu_table.py -i otu_table_mc2_w_tax_no_pynast_failures.biom -o otu_table_final.biom -n c__Chloroplast,f__mitochondria
#Removing Chloroplast and mitochondrial sequences.
# > biom convert -i file.biom -o file.txt --to-tsv to make input & output biom tables viewable as text, and use > wc -l file.txt to count lines
#Input 1192, output 1159 (confirmed removal of chloroplasts and mitochondria)

> biom summarize_table -i otu_table_final.biom -o summary_otu_table_final.txt
#Use 'more' to look at output table.
#Gives seq numbers so you can determine lowest sample number for rarefaction
Num samples: 14
Num observations: 1157
Total count: 294504
Table density (fraction of non-zero values): 0.205
Counts/sample summary:
Min: 1096.0
Max: 44282.0
Median: 22190.500
Mean: 21036.000
Std. dev.: 12483.262
Sample Metadata Categories: None provided
Observation Metadata Categories: taxonomy
Counts/sample detail:
NF20Q1: 1096.0
NF9Q6: 6772.0
NF6Q6: 7541.0
NF5Q7: 11422.0
NF6Q7: 14652.0
NF2Q6: 17030.0
AKPP4: 22139.0
AK325: 22242.0
NF1Q6: 22447.0
AKUT1: 23064.0
NF5Q6: 23285.0
AK342: 35370.0
AKPP1: 43162.0
AKPP2: 44282.0

> filter_samples_from_otu_table.py -i otu_table_final.biom -o filtered_otu_table.biom --sample_id_fp ids2.txt --negate_sample_id_fp
#Created file ids2.txt listing the three samples with counts below 10,000: NF20Q1, NF9Q6, NF6Q6
#Script discards samples where the id is listed in ids2.txt

> biom summarize_table -i filtered_otu_table.biom -o summary_otu_table_filtered_final.txt
#Use 'more' to look at output table. Gives seq numbers so you can determine lowest sample number for rarefaction
Num samples: 11
Num observations: 1157
Total count: 279095
Table density (fraction of non-zero values): 0.210
Counts/sample summary:
Min: 11422.0
Max: 44282.0
Median: 22447.000
Mean: 25372.273
Std. dev.: 10408.167
Sample Metadata Categories: None provided
Observation Metadata Categories: taxonomy
Counts/sample detail:
NF5Q7: 11422.0
NF6Q7: 14652.0
NF2Q6: 17030.0
AKPP4: 22139.0
AK325: 22242.0
NF1Q6: 22447.0
AKUT1: 23064.0
NF5Q6: 23285.0
AK342: 35370.0
AKPP1: 43162.0
AKPP2: 44282.0

> single_rarefaction.py -i filtered_otu_table.biom -o otu_table_final_rarefied11422.biom -d 11422
#Rarefaction step to compare all samples to the depth of 11,422 sequences

> biom summarize_table -i otu_table_final_rarefied11422.biom -o summary_otu_table_rarefied11422.txt
#Examine output to confirm all samples are now 11,422 each

> mkdir Alpha_Diversity_rare11422/

> alpha_diversity.py -i otu_table_final_rarefied11422.biom -m observed_otus,ace,chao1,simpson_reciprocal,shannon,simpson_e -o Alpha_Diversity_rare11422/Alpha_Diversity_rare11422.txt -t rep_set.tre
#more Alpha_Diversity_rare11422/Alpha_Diversity_rare11422.txt
	observed_otus	ace	chao1	simpson_reciprocal	shannon	simpson_e
NF2Q6	451.0	496.450394598	508.37254902	11.8855857097	5.69753801711	0.0263538485802
AKPP2	101.0	131.376641022	122.75	1.18591120003	0.808283510251	0.0117416950498
NF1Q6	165.0	230.316872428	233.875	1.87451087269	1.96655696729	0.0113606719557
AKPP1	71.0	117.356767299	109.153846154	1.06604024443	0.348417133241	0.01501465133
AKPP4	192.0	234.682478973	230.277777778	3.44859711006	2.90475710982	0.0179614432815
NF6Q7	477.0	524.313465064	526.567164179	10.8122002368	5.35961554883	0.0226670864504
NF5Q7	334.0	370.715337816	363.5	7.35321114775	4.70351212322	0.0220156022388
AKUT1	154.0	183.688219405	185.2	1.78756875542	1.92482699417	0.0116075893209
AK325	189.0	202.769712432	202.636363636	8.40272917521	4.26429987699	0.0444588845249
NF5Q6	113.0	135.758157895	127.04	1.43880268648	1.36412175706	0.012732767137
AK342	83.0	147.830779533	125.5	1.41035486193	1.16497023068	0.0169922272521

> summarize_taxa_through_plots.py -o Alpha_Diversity_rare11422/taxa_summary11422/ -i otu_table_final_rarefied11422.biom

> alpha_rarefaction.py -i otu_table_final_rarefied11422.biom -o Rarefaction/ -t rep_set.tre -m Primnoa_map.txt -e 11422

> beta_diversity.py -i otu_table_final_rarefied11422.biom -m unweighted_unifrac,weighted_unifrac,binary_sorensen_dice,bray_curtis -o compar_div_rare11422/ -t rep_set.tre

> beta_significance.py -i otu_table_final_rarefied11422.biom -t rep_set.tre -s unweighted_unifrac -o unw_sig.txt
#unweighted unifrac significance test
sample 1	sample 2	p value	p value (Bonferroni corrected)
AK325	AK342	0.0	<=1.0e-02
AK325	AKPP1	0.0	<=1.0e-02
AK325	AKPP2	0.0	<=1.0e-02
AK325	AKPP4	0.0	<=1.0e-02
AK325	AKUT1	0.0	<=1.0e-02
AK325	NF1Q6	0.0	<=1.0e-02
AK325	NF2Q6	0.0	<=1.0e-02
AK325	NF5Q6	0.02	1.0
AK325	NF5Q7	0.0	<=1.0e-02
AK325	NF6Q7	0.0	<=1.0e-02
AK342	AKPP1	0.64	1.0
AK342	AKPP2	0.04	1.0
AK342	AKPP4	0.03	1.0
AK342	AKUT1	0.32	1.0
AK342	NF1Q6	0.0	<=1.0e-02
AK342	NF2Q6	0.0	<=1.0e-02
AK342	NF5Q6	0.07	1.0
AK342	NF5Q7	0.02	1.0
AK342	NF6Q7	0.0	<=1.0e-02
AKPP1	AKPP2	0.79	1.0
AKPP1	AKPP4	0.82	1.0
AKPP1	AKUT1	0.15	1.0
AKPP1	NF1Q6	0.43	1.0
AKPP1	NF2Q6	0.01	0.55
AKPP1	NF5Q6	0.19	1.0
AKPP1	NF5Q7	0.37	1.0
AKPP1	NF6Q7	0.04	1.0
AKPP2	AKPP4	0.09	1.0
AKPP2	AKUT1	0.0	<=1.0e-02
AKPP2	NF1Q6	0.0	<=1.0e-02
AKPP2	NF2Q6	0.0	<=1.0e-02
AKPP2	NF5Q6	0.01	0.55
AKPP2	NF5Q7	0.0	<=1.0e-02
AKPP2	NF6Q7	0.0	<=1.0e-02
AKPP4	AKUT1	0.01	0.55
AKPP4	NF1Q6	0.0	<=1.0e-02
AKPP4	NF2Q6	0.0	<=1.0e-02
AKPP4	NF5Q6	0.12	1.0
AKPP4	NF5Q7	0.0	<=1.0e-02
AKPP4	NF6Q7	0.0	<=1.0e-02
AKUT1	NF1Q6	0.0	<=1.0e-02
AKUT1	NF2Q6	0.0	<=1.0e-02
AKUT1	NF5Q6	0.01	0.55
AKUT1	NF5Q7	0.0	<=1.0e-02
AKUT1	NF6Q7	0.0	<=1.0e-02
NF1Q6	NF2Q6	0.0	<=1.0e-02
NF1Q6	NF5Q6	0.34	1.0
NF1Q6	NF5Q7	0.0	<=1.0e-02
NF1Q6	NF6Q7	0.0	<=1.0e-02
NF2Q6	NF5Q6	0.06	1.0
NF2Q6	NF5Q7	0.0	<=1.0e-02
NF2Q6	NF6Q7	0.0	<=1.0e-02
NF5Q6	NF5Q7	0.04	1.0
NF5Q6	NF6Q7	0.0	<=1.0e-02
NF5Q7	NF6Q7	0.0	<=1.0e-02

> beta_significance.py -i otu_table_final_rarefied11422.biom -t rep_set.tre -s weighted_unifrac -o w_sig.txt
#weighted unifrac significance test
sample 1	sample 2	p value	p value (Bonferroni corrected)
AK325
AK342 0.36 1.0 AK325 AKPP1 0.3 1.0 AK325 AKPP2 0.35 1.0 AK325 AKPP4 0.2 1.0 AK325 AKUT1 0.33 1.0 AK325 NF1Q6 0.9 1.0 AK325 NF2Q6 0.26 1.0 AK325 NF5Q6 0.97 1.0 AK325 NF5Q7 0.45 1.0 AK325 NF6Q7 0.47 1.0 AK342 AKPP1 0.26 1.0 AK342 AKPP2 0.35 1.0 AK342 AKPP4 0.52 1.0 AK342 AKUT1 0.34 1.0 AK342 NF1Q6 0.63 1.0 AK342 NF2Q6 0.41 1.0 AK342 NF5Q6 0.57 1.0 AK342 NF5Q7 0.36 1.0 AK342 NF6Q7 0.46 1.0 AKPP1 AKPP2 0.36 1.0 AKPP1 AKPP4 0.5 1.0 AKPP1 AKUT1 0.23 1.0 AKPP1 NF1Q6 0.61 1.0 AKPP1 NF2Q6 0.34 1.0 AKPP1 NF5Q6 0.52 1.0 AKPP1 NF5Q7 0.39 1.0 AKPP1 NF6Q7 0.36 1.0 AKPP2 AKPP4 0.4 1.0 AKPP2 AKUT1 0.21 1.0 AKPP2 NF1Q6 0.67 1.0 AKPP2 NF2Q6 0.33 1.0 AKPP2 NF5Q6 0.52 1.0 AKPP2 NF5Q7 0.39 1.0 AKPP2 NF6Q7 0.52 1.0 AKPP4 AKUT1 0.52 1.0 AKPP4 NF1Q6 0.61 1.0 AKPP4 NF2Q6 0.27 1.0 AKPP4 NF5Q6 0.75 1.0 AKPP4 NF5Q7 0.32 1.0 AKPP4 NF6Q7 0.29 1.0 AKUT1 NF1Q6 0.69 1.0 AKUT1 NF2Q6 0.34 1.0 AKUT1 NF5Q6 0.64 1.0 AKUT1 NF5Q7 0.43 1.0 AKUT1 NF6Q7 0.3 1.0 NF1Q6 NF2Q6 0.98 1.0 NF1Q6 NF5Q6 0.99 1.0 NF1Q6 NF5Q7 0.95 1.0 NF1Q6 NF6Q7 0.88 1.0 NF2Q6 NF5Q6 0.89 1.0 NF2Q6 NF5Q7 0.44 1.0 NF2Q6 NF6Q7 0.64 1.0 NF5Q6 NF5Q7 0.97 1.0 NF5Q6 NF6Q7 0.92 1.0 NF5Q7 NF6Q7 0.85 1.0 > principal_coordinates.py -i compar_div_rare11422/ -o compar_div_rare11422_PCoA/ > make_2d_plots.py -i compar_div_rare11422_PCoA/pcoa_weighted_unifrac_otu_table_final_rarefied11422.txt -m ../Primnoa_map.txt -o PCoA_2D_plot_WU/ > make_2d_plots.py -i compar_div_rare11422_PCoA/pcoa_unweighted_unifrac_otu_table_final_rarefied11422.txt -m ../Primnoa_map.txt -o PCoA_2D_plot_UU/ > make_2d_plots.py -i compar_div_rare11422_PCoA/pcoa_binary_sorensen_dice_otu_table_final_rarefied11422.txt -m ../Primnoa_map.txt -o PCoA_2D_plot_BSD/ > make_2d_plots.py -i compar_div_rare11422_PCoA/pcoa_bray_curtis_otu_table_final_rarefied11422.txt -m ../Primnoa_map.txt -o PCoA_2D_plot_BC/ > compute_core_microbiome.py -i otu_table_final_rarefied11422.biom -o otu_table_core > summarize_taxa_through_plots.py -o Core_taxa_summary11422/ -i otu_table_core/core_table_100.biom > 
make_otu_heatmap.py -i Core_taxa_summary11422/core_table_100_L5.biom -o heatmap_core_family.pdf 1
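#Optional cross-check of two numbers reported earlier in this log, written as a minimal Python sketch. It is not part of the QIIME workflow.
#It assumes the 10,000-sequence cutoff used to build ids2.txt.
#It also assumes the Bonferroni column is the raw p value multiplied by the number of sample pairs (55 for 11 samples) and capped at 1.0, which is consistent with the 0.01 -> 0.55 rows in the unweighted UniFrac table.
#The names split_by_depth and bonferroni are hypothetical helpers, not QIIME functions.

```python
# Counts/sample detail copied from the unfiltered `biom summarize_table` output in this log.
counts = {
    "NF20Q1": 1096, "NF9Q6": 6772, "NF6Q6": 7541, "NF5Q7": 11422,
    "NF6Q7": 14652, "NF2Q6": 17030, "AKPP4": 22139, "AK325": 22242,
    "NF1Q6": 22447, "AKUT1": 23064, "NF5Q6": 23285, "AK342": 35370,
    "AKPP1": 43162, "AKPP2": 44282,
}

MIN_COUNT = 10000  # assumed cutoff: samples below this were listed in ids2.txt


def split_by_depth(sample_counts, min_count):
    """Return (dropped_ids, kept_counts) for a minimum sequence count."""
    dropped = sorted(s for s, n in sample_counts.items() if n < min_count)
    kept = {s: n for s, n in sample_counts.items() if n >= min_count}
    return dropped, kept


def bonferroni(p_value, n_samples):
    """Multiply a pairwise p value by the number of sample pairs, capped at 1.0."""
    n_comparisons = n_samples * (n_samples - 1) // 2
    return min(1.0, p_value * n_comparisons)


dropped, kept = split_by_depth(counts, MIN_COUNT)
rarefaction_depth = min(kept.values())  # lowest count among retained samples

print("samples to drop:", dropped)
print("rarefaction depth:", rarefaction_depth)
print("total count after filtering:", sum(kept.values()))
print("Bonferroni-corrected p for 0.01:", bonferroni(0.01, len(kept)))
```

#The dropped IDs, rarefaction depth and filtered total can be compared against ids2.txt, the -d value passed to single_rarefaction.py, and the "Total count" line of the filtered summary.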