Final analysis of Primnoa coral samples, September/October 2015, showing specific filenames
Running on the Amazon EC2 server, using QIIME 1.9

System information
==================
         Platform:	linux2
   Python version:	2.7.3 (default, Aug  1 2012, 05:14:39)  [GCC 4.6.3]
Python executable:	/usr/bin/python

QIIME default reference information
===================================
For details on what files are used as QIIME's default references, see here:
 https://github.com/biocore/qiime-default-reference/releases/tag/0.1.2

Dependency versions
===================
          QIIME library version:	1.9.1
           QIIME script version:	1.9.1
qiime-default-reference version:	0.1.2
                  NumPy version:	1.9.2
                  SciPy version:	0.15.1
                 pandas version:	0.16.1
             matplotlib version:	1.4.3
            biom-format version:	2.1.4
                   h5py version:	2.5.0 (HDF5 version: 1.8.4)
                   qcli version:	0.1.1
                   pyqi version:	0.3.2
             scikit-bio version:	0.2.3
                 PyNAST version:	1.2.2
                Emperor version:	0.9.51
                burrito version:	0.9.1
       burrito-fillings version:	0.1.1
              sortmerna version:	SortMeRNA version 2.0, 29/11/2014
              sumaclust version:	SUMACLUST Version 1.0.00
                  swarm version:	Swarm 1.2.19 [May 26 2015 15:28:37]
                          gdata:	Installed.


> process_sff.py -i SFF/ -f -o Output_dir

#Files from 454 run 1 (prim1), containing 7 Primnoa pacifica from Alaska and 10 P. resedaeformis from Baltimore Canyon (plus other corals),
#and run 2 (3008), containing 10 P. resedaeformis from Norfolk Canyon (plus other corals)
#H8MF54001.fna   H8MF54001.txt  H8MF54002.qual  ID2UZ7K01.fna   ID2UZ7K01.txt  ID2UZ7K02.qual
#H8MF54001.qual  H8MF54002.fna  H8MF54002.txt   ID2UZ7K01.qual  ID2UZ7K02.fna  ID2UZ7K02.txt


> split_libraries.py -m Primnoa3008_map.txt -f Output_dir/ID2UZ7K01.fna,Output_dir/ID2UZ7K02.fna -q Output_dir/ID2UZ7K01.qual,Output_dir/ID2UZ7K02.qual -w 50 -g -r -l 200 -L 700 -M 1 -b 10 -n 1000000 -o split_libraries_output_3008

#Number raw input seqs	964771
#Length outside bounds of 200 and 700	48261
#Num ambiguous bases exceeds limit of 6	160
#Missing Qual Score	0
#Mean qual score below minimum of 25	14582
#Max homopolymer run exceeds limit of 6	45454
#Num mismatches in primer exceeds limit of 1: 38163
#Size of quality score window, in base pairs: 50
#Number of sequences where a low quality score window was detected: 5577
#Sequences with a low quality score were not written, -g option enabled.

#Sequence length details for all sequences passing quality filters:
#Raw len min/max/avg	200.0/488.0/389.2
#Wrote len min/max/avg	175.0/463.0/364.2
#Barcodes corrected/not	224/768195
#Uncorrected barcodes will not be written to the output fasta file.
#Corrected barcodes will be written with the appropriate barcode category.
#Corrected but unassigned sequences will not be written unless --retain_unassigned_reads is enabled.
#Total valid barcodes that are not in mapping file	0
#Sequences associated with valid barcodes that are not in the mapping file will not be written.

#Barcodes in mapping file
#Num Samples	10
#Sample ct min/max/mean: 1175 / 10162 / 4437.90
#Sample	Sequence Count	Barcode
#NF20Q1	10162	AGACGCACTC
#RB684Q4	6996	CGTGTCTCTA
#RB684Q5	6207	CTCGCGTGTC
#NF12Q7	4525	ACGCTCGACA
#RB684Q1	4198	AGCACTGTAG
#RB684Q3	3954	ATATCGCGAG
#RB687Q2	3450	TGATACGTCT
#RB686Q3	2123	TCTCTATGCG
#NF12Q6	1589	ACGAGTGCGT
#RB684Q2	1175	ATCAGACACG
#Total number seqs written	44379
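
#Quick sanity check on the log above (a minimal sketch, not part of the QIIME run): the per-sample
#counts should add up to "Total number seqs written" and reproduce the reported min/max/mean.
#The numbers are copied from the run 3008 log; the variable names are only for illustration.

```python
# Per-sample counts copied from the split_libraries.py log for run 3008.
counts_3008 = {
    "NF20Q1": 10162, "RB684Q4": 6996, "RB684Q5": 6207, "NF12Q7": 4525,
    "RB684Q1": 4198, "RB684Q3": 3954, "RB687Q2": 3450, "RB686Q3": 2123,
    "NF12Q6": 1589, "RB684Q2": 1175,
}

total = sum(counts_3008.values())
mean = total / len(counts_3008)

print(total)                      # compare with "Total number seqs written"
print(min(counts_3008.values()))  # compare with the min in "Sample ct min/max/mean"
print(max(counts_3008.values()))  # compare with the max
print(round(mean, 2))             # compare with the mean
```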


> split_libraries.py -m Primnoaprim1_map.txt -f Output_dir/H8MF54001.fna,Output_dir/H8MF54002.fna -q Output_dir/H8MF54001.qual,Output_dir/H8MF54002.qual -w 50 -g -r -l 200 -L 700 -M 1 -b 10 -n 2000000 -o split_libraries_output_prim1

#Number raw input seqs	552695
#Length outside bounds of 200 and 700	57592
#Num ambiguous bases exceeds limit of 6	214
#Missing Qual Score	0
#Mean qual score below minimum of 25	3264
#Max homopolymer run exceeds limit of 6	18919
#Num mismatches in primer exceeds limit of 1: 12246
#Size of quality score window, in base pairs: 50
#Number of sequences where a low quality score window was detected: 72677
#Sequences with a low quality score were not written, -g option enabled.

#Sequence length details for all sequences passing quality filters:
#Raw len min/max/avg	200.0/537.0/372.4
#Wrote len min/max/avg	175.0/512.0/347.4
#Barcodes corrected/not	1297/22274
#Uncorrected barcodes will not be written to the output fasta file.
#Corrected barcodes will be written with the appropriate barcode category.
#Corrected but unassigned sequences will not be written unless --retain_unassigned_reads is enabled.
#Total valid barcodes that are not in mapping file	0
#Sequences associated with valid barcodes that are not in the mapping file will not be written.

#Barcodes in mapping file
#Num Samples	16
#Sample ct min/max/mean: 2905 / 44630 / 22844.31
#Sample	Sequence Count	Barcode
#AKPP2	44630	CATAGTAGTG
#AKPP1	43859	TGATACGTCT
#AK342	38575	CTCGCGTGTC
#NF1Q6	36012	CGTCTAGTAC
#AK325	27011	CGTGTCTCTA
#NF5Q6	25063	ATCAGACACG
#AKUT1	24174	TCTCTATGCG
#AKPP4	22349	ATACGACGTA
#NF9Q6	18599	AGCACTGTAG
#NF2Q6	17607	TCTACGTAGC
#NF6Q7	16977	TGTACTACTC
#NF5Q7	15111	ATATCGCGAG
#NF6Q6	15027	TCACGTACTA
#NF9Q7	8841	ACGAGTGCGT
#NF10Q6	8769	ACGCTCGACA
#AKPP3	2905	CGAGAGATAC
#NF10Q7	0	AGACGCACTC
#Total number seqs written	365509
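
#Why "Wrote len" is shorter than "Raw len": in both logs the written lengths are exactly 25 bases shorter
#than the raw lengths. A minimal sketch of the arithmetic, ASSUMING the forward primer in the mapping files
#is the same 15-base primer later passed to denoiser_preprocess.py, so that barcode (10) + primer (15) = 25.

```python
BARCODE_LEN = 10                    # value passed to split_libraries.py via -b
FORWARD_PRIMER = "AYTGGGYDTAAAGNG"  # assumption: same primer as in the mapping files

trimmed = BARCODE_LEN + len(FORWARD_PRIMER)

# (raw, written) length pairs copied from the two split_libraries.py logs.
length_pairs = {
    "3008 min": (200.0, 175.0), "3008 max": (488.0, 463.0), "3008 avg": (389.2, 364.2),
    "prim1 min": (200.0, 175.0), "prim1 max": (537.0, 512.0), "prim1 avg": (372.4, 347.4),
}

# Difference between raw and written length for every reported statistic.
offsets = {name: round(raw - written, 1) for name, (raw, written) in length_pairs.items()}

print(trimmed)
print(offsets)
```

#This also explains why the written minimum (175) is below the -l 200 cut-off: the length filter is applied to the raw read.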


> cat split_libraries_output_3008/seqs.fna split_libraries_output_prim1/seqs.fna > combined.fna

#Concatenates the two fasta files from the separate plates into a single file for subsequent processing


> mkdir Denoised_preprocess

#For some reason, the next script throws an error ("Creating temporary directory failed") if you don't create the output directory first


> denoiser_preprocess.py -i Output_dir/ID2UZ7K01.txt,Output_dir/ID2UZ7K02.txt,Output_dir/H8MF54001.txt,Output_dir/H8MF54002.txt -f combined.fna -p AYTGGGYDTAAAGNG -o Denoised_preprocess

#Runs first clustering phase which groups reads based on common prefixes
#Removes primer sequence (-p) specified from reads before running phase 1
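
#The primer given to -p contains IUPAC ambiguity codes (Y = C or T, D = A, G or T, N = any base).
#The sketch below only illustrates what matching and removing such a degenerate primer means; it uses exact
#matching with a regular expression and is NOT how denoiser_preprocess.py is implemented.
#The example read is hypothetical.

```python
import re

# IUPAC nucleotide codes needed for this primer (a subset of the full alphabet).
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "Y": "[CT]", "D": "[AGT]", "N": "[ACGT]"}


def primer_to_regex(primer):
    """Translate a degenerate primer into a regular expression anchored at the read start."""
    return re.compile("^" + "".join(IUPAC[base] for base in primer))


def strip_primer(read, primer):
    """Return the read without its leading primer, or unchanged if the primer is absent."""
    match = primer_to_regex(primer).match(read)
    return read[match.end():] if match else read


primer = "AYTGGGYDTAAAGNG"
example_read = "ACTGGGTATAAAGCG" + "TTGACCGT"  # hypothetical read, not from the data

print(strip_primer(example_read, primer))
```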


> denoiser.py -i Output_dir/ID2UZ7K01.txt,Output_dir/ID2UZ7K02.txt,Output_dir/H8MF54001.txt,Output_dir/H8MF54002.txt -f combined.fna -p Denoised_preprocess -o Denoised_final --titanium

#Removes sequencing noise characteristic of pyrosequencing by flowgram clustering
# -p flag tells it to skip the Phase I preprocess and use the output from denoiser_preprocess.py instead
# --titanium tells it to use the Titanium error profile
# if it throws a memory error, try again with the --low_memory flag

> inflate_denoiser_output.py -c Denoised_final/centroids.fasta -s Denoised_final/singletons.fasta -f combined.fna -d Denoised_final/denoiser_mapping.txt -o inflated_seqs.fna

#Inflates denoiser results so they can be passed directly to OTU picker
#NOT ABUNDANCE SORTED, but confirmed w/ developers ok to use usearch61 instead of uclust
#409886  : inflated_seqs.fna (Sequence lengths (mean +/- std): 359.2973 +/- 19.1991)
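
#Bookkeeping check (a minimal sketch using only numbers from this log): the two split_libraries.py runs wrote
#44379 + 365509 reads, which is 2 more than the count reported for inflated_seqs.fna. The per-sample counts
#further down show where they went (NF2Q6 17607 -> 17606, AK342 38575 -> 38574); the log does not record why.

```python
written_3008 = 44379     # "Total number seqs written", run 3008
written_prim1 = 365509   # "Total number seqs written", run prim1
inflated = 409886        # count reported for inflated_seqs.fna

combined = written_3008 + written_prim1
missing = combined - inflated

print(combined)  # reads that went into combined.fna
print(missing)   # reads not present after denoising and inflating
```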


> truncate_reverse_primer.py -f inflated_seqs.fna -m Primnoa_map.txt -o reverse_primer_removed/

#Removes reverse primer sequences, if present. Leaving them in can interfere with OTU picking.
#Original fasta filepath: inflated_seqs.fna
#Total seqs in fasta: 409886
#Mapping filepath: Primnoa_map.txt
#Truncation option: truncate_only
#Mismatches allowed: 2
#Total seqs written: 409886
#SampleIDs not found: 0
#Reverse primers not found: 18212


> split_sequence_file_on_sample_ids.py -i reverse_primer_removed/inflated_seqs_rev_primer_truncated.fna -o Out_countseqs/

#Splitting into individual sample files so we can count the number of sequences in each and decide if any poor runs need to be removed


> cd Out_countseqs/
> count_seqs.py -i "*.fasta"

#Prints to screen the number of sequences in each sample file, so we can ID poor runs for removal before rarefaction
#1175  : RB684Q2.fasta (Sequence lengths (mean +/- std): 349.5566 +/- 32.3843)
#1589  : NF12Q6.fasta (Sequence lengths (mean +/- std): 289.3820 +/- 65.7564)
#2123  : RB686Q3.fasta (Sequence lengths (mean +/- std): 333.0372 +/- 14.3262)
#2905  : AKPP3.fasta (Sequence lengths (mean +/- std): 329.3198 +/- 10.8618)
#3450  : RB687Q2.fasta (Sequence lengths (mean +/- std): 345.8954 +/- 29.8241)
#3954  : RB684Q3.fasta (Sequence lengths (mean +/- std): 340.7726 +/- 25.3149)
#4198  : RB684Q1.fasta (Sequence lengths (mean +/- std): 329.5617 +/- 40.4511)
#4525  : NF12Q7.fasta (Sequence lengths (mean +/- std): 331.1876 +/- 12.7309)
#6207  : RB684Q5.fasta (Sequence lengths (mean +/- std): 351.9515 +/- 23.1221)
#6996  : RB684Q4.fasta (Sequence lengths (mean +/- std): 344.3622 +/- 25.7358)
#8769  : NF10Q6.fasta (Sequence lengths (mean +/- std): 332.1034 +/- 11.8722)
#8841  : NF9Q7.fasta (Sequence lengths (mean +/- std): 332.4022 +/- 10.0840)
#10162  : NF20Q1.fasta (Sequence lengths (mean +/- std): 325.9087 +/- 23.5227)
#15027  : NF6Q6.fasta (Sequence lengths (mean +/- std): 308.4805 +/- 51.3313)
#15111  : NF5Q7.fasta (Sequence lengths (mean +/- std): 331.0245 +/- 11.5891)
#16977  : NF6Q7.fasta (Sequence lengths (mean +/- std): 329.1873 +/- 14.6630)
#17606  : NF2Q6.fasta (Sequence lengths (mean +/- std): 330.6885 +/- 8.8242)
#18599  : NF9Q6.fasta (Sequence lengths (mean +/- std): 331.2556 +/- 12.3142)
#22349  : AKPP4.fasta (Sequence lengths (mean +/- std): 331.0313 +/- 5.8035)
#24174  : AKUT1.fasta (Sequence lengths (mean +/- std): 330.7022 +/- 8.4301)
#25063  : NF5Q6.fasta (Sequence lengths (mean +/- std): 325.2797 +/- 21.9882)
#27011  : AK325.fasta (Sequence lengths (mean +/- std): 325.8296 +/- 20.2151)
#36012  : NF1Q6.fasta (Sequence lengths (mean +/- std): 327.7966 +/- 9.3134)
#38574  : AK342.fasta (Sequence lengths (mean +/- std): 330.8569 +/- 5.9764)
#43859  : AKPP1.fasta (Sequence lengths (mean +/- std): 332.2088 +/- 4.3634)
#44630  : AKPP2.fasta (Sequence lengths (mean +/- std): 331.8067 +/- 2.5637)
#409886  : Total


> extract_seqs_by_sample_id.py -i reverse_primer_removed/inflated_seqs_rev_primer_truncated.fna -o outseqs.fasta -s RB684Q2,NF12Q6,RB686Q3 -n

#Creates an fna file containing the sequences from all samples EXCEPT the three smallest runs listed after -s (the -n flag negates the selection), so those three are removed.
#Check results by resplitting and counting
		#> split_sequence_file_on_sample_ids.py -i outseqs.fasta -o Out_countseqs2/
		#> cd Out_countseqs2/
		#> count_seqs.py -i "*.fasta"
#So now outseqs.fasta is the new combined file with all the samples we want to process further
#404999  : outseqs.fasta (Sequence lengths (mean +/- std): 329.9701 +/- 17.5445)


> extract_seqs_by_sample_id.py -i reverse_primer_removed/inflated_seqs_rev_primer_truncated.fna -o outseqs_10K.fasta -s RB684Q2,NF12Q6,RB686Q3,AKPP3,RB687Q2,RB684Q3,RB684Q1,NF12Q7,RB684Q5,RB684Q4,NF10Q6,NF9Q7 -n

#Creates an fna file containing the sequences from all samples EXCEPT the runs listed after -s (those with <10,000 seqs), so those are removed.
#This removes all Norfolk Canyon samples from the data set but we may want to look at this later to get a deeper view of Baltimore vs. Pacific
#Check results by resplitting and counting
		#> split_sequence_file_on_sample_ids.py -i outseqs_10K.fasta -o Out_countseqs3/
		#> cd Out_countseqs3/
		#> count_seqs.py -i "*.fasta"
#355154  : outseqs_10K.fasta (Sequence lengths (mean +/- std): 328.9089 +/- 16.3329)
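
#The two sample lists passed to -s can be re-derived from the count_seqs.py output. A minimal sketch
#(counts copied from the log; function and variable names are illustrative only) that picks the three
#smallest runs and the runs under 10,000 reads, then checks the sizes of outseqs.fasta and outseqs_10K.fasta.

```python
# Per-sample counts copied from the count_seqs.py output above.
counts = {
    "RB684Q2": 1175, "NF12Q6": 1589, "RB686Q3": 2123, "AKPP3": 2905,
    "RB687Q2": 3450, "RB684Q3": 3954, "RB684Q1": 4198, "NF12Q7": 4525,
    "RB684Q5": 6207, "RB684Q4": 6996, "NF10Q6": 8769, "NF9Q7": 8841,
    "NF20Q1": 10162, "NF6Q6": 15027, "NF5Q7": 15111, "NF6Q7": 16977,
    "NF2Q6": 17606, "NF9Q6": 18599, "AKPP4": 22349, "AKUT1": 24174,
    "NF5Q6": 25063, "AK325": 27011, "NF1Q6": 36012, "AK342": 38574,
    "AKPP1": 43859, "AKPP2": 44630,
}


def remaining_after_removal(counts, removed):
    """Number of reads left once the listed samples are dropped."""
    return sum(n for sample, n in counts.items() if sample not in removed)


three_smallest = sorted(counts, key=counts.get)[:3]
under_10k = [sample for sample, n in counts.items() if n < 10000]

print(three_smallest, remaining_after_removal(counts, three_smallest))
print(len(under_10k), remaining_after_removal(counts, under_10k))
```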


> pick_open_reference_otus.py -i outseqs.fasta -m usearch61 -o usearch61_openref_Green/ -f 

#Only way to implement new sub-sampled open-reference clustering; Rideout et al. (2014) https://peerj.com/articles/545/
#Runs pick_otus.py, pick_rep_set.py, align_seqs.py, assign_taxonomy.py, make_otu_table.py, make_phylogeny.py
#use usearch61 instead of uclust b/c better AND has chimera-checking incorporated
#Alignment is performed with PyNAST, taxonomy is assigned with uclust
#Singletons are removed from the OTU table as a default (--min_otu_size = 2)
#Default reference database is greengenes 


> cd usearch61_openref_Green/uclust_assigned_taxonomy

#2174 lines in rep_set_tax_assignments.txt by wc -l
#grep -c k__Bact gives 1296 and grep -c Unassigned gives 878 (1296 + 878 = 2174); 34 chloroplasts and 5 mitochondria need to be removed


> cd ../pynast_aligned_seqs
#1576 sequences in rep_set_failures.fasta; checked the first 3 in BLAST to see if they are really bad
#All of those checked really were garbage (18S, mitochondria, short), so moving forward with otu_table_mc2_w_tax_no_pynast_failures.biom rather than otu_table_mc2_w_tax.biom

> cd ..
> filter_taxa_from_otu_table.py -i otu_table_mc2_w_tax_no_pynast_failures.biom -o otu_table_final.biom -n c__Chloroplast,f__mitochondria

#Removing Chloroplast and mitochondrial sequences.
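#To view the input and output biom tables as text, use > biom convert -i file.biom -o file.txt --to-tsv and then count lines with > wc -l file.txt
#Input 1388 lines, output 1352 lines (confirmed removal of chloroplasts and mitochondria)
#The text file has 2 header lines, so 1352 lines = 1350 OTUs, which matches "Num observations" in the summary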


> biom summarize_table -i otu_table_final.biom -o summary_otu_table_final.txt

#Use 'more' to look at output table. Gives seq numbers so you can determine lowest sample number for rarefaction
#Num samples: 23
#Num observations: 1350
#Total count: 325694
#Table density (fraction of non-zero values): 0.161

#Counts/sample summary:
#Min: 689.0
#Max: 44282.0
#Median: 7546.000
#Mean: 14160.609
#Std. dev.: 13012.255

#Counts/sample detail:
#NF12Q7: 689.0
#NF20Q1: 1099.0
#AKPP3: 2328.0
#RB687Q2: 2557.0
#RB684Q1: 2561.0
#RB684Q3: 3176.0
#NF10Q6: 4261.0
#RB684Q4: 4753.0
#NF9Q7: 5024.0
#RB684Q5: 5820.0
#NF9Q6: 6773.0
#NF6Q6: 7546.0
#NF5Q7: 11425.0
#NF6Q7: 14653.0
#NF2Q6: 17031.0
#AKPP4: 22140.0
#AK325: 22242.0
#NF1Q6: 22448.0
#AKUT1: 23068.0
#NF5Q6: 23285.0
#AK342: 35370.0
#AKPP1: 43163.0
#AKPP2: 44282.0

> filter_samples_from_otu_table.py -i otu_table_final.biom -o filtered_otu_table.biom --sample_id_fp ids.txt --negate_sample_id_fp

#Created file ids.txt listing the three samples with counts below 2,500: NF12Q7, NF20Q1, AKPP3
#With --negate_sample_id_fp the script discards the samples whose IDs are listed in ids.txt


> biom summarize_table -i filtered_otu_table.biom -o summary_otu_table_filtered_final.txt

#Use 'more' to look at output table. Gives seq numbers so you can determine lowest sample number for rarefaction
#Num samples: 20
#Num observations: 1350
#Total count: 321578
#Table density (fraction of non-zero values): 0.174

#Counts/sample summary:
#Min: 2557.0
#Max: 44282.0
#Median: 13039.000
#Mean: 16078.900
#Std. dev.: 12900.843
#Sample Metadata Categories: None provided
#Observation Metadata Categories: taxonomy

#Counts/sample detail:
#RB687Q2: 2557.0
#RB684Q1: 2561.0
#RB684Q3: 3176.0
#NF10Q6: 4261.0
#RB684Q4: 4753.0
#NF9Q7: 5024.0
#RB684Q5: 5820.0
#NF9Q6: 6773.0
#NF6Q6: 7546.0
#NF5Q7: 11425.0
#NF6Q7: 14653.0
#NF2Q6: 17031.0
#AKPP4: 22140.0
#AK325: 22242.0
#NF1Q6: 22448.0
#AKUT1: 23068.0
#NF5Q6: 23285.0
#AK342: 35370.0
#AKPP1: 43163.0
#AKPP2: 44282.0


> single_rarefaction.py -i filtered_otu_table.biom -o otu_table_final_rarefied2557.biom -d 2557

#Rarefaction step: subsamples every sample to an even depth of 2557 sequences (the smallest remaining sample) so all samples can be compared
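
#How the depth of 2557 follows from the summary table: a minimal sketch (counts copied from the
#"Counts/sample detail" output; the 2,500 cut-off is the one used to build ids.txt) that drops the
#shallow samples and takes the smallest remaining count as the rarefaction depth.

```python
# Counts per sample copied from summary_otu_table_final.txt.
counts = {
    "NF12Q7": 689, "NF20Q1": 1099, "AKPP3": 2328, "RB687Q2": 2557,
    "RB684Q1": 2561, "RB684Q3": 3176, "NF10Q6": 4261, "RB684Q4": 4753,
    "NF9Q7": 5024, "RB684Q5": 5820, "NF9Q6": 6773, "NF6Q6": 7546,
    "NF5Q7": 11425, "NF6Q7": 14653, "NF2Q6": 17031, "AKPP4": 22140,
    "AK325": 22242, "NF1Q6": 22448, "AKUT1": 23068, "NF5Q6": 23285,
    "AK342": 35370, "AKPP1": 43163, "AKPP2": 44282,
}

MIN_COUNT = 2500  # samples below this were listed in ids.txt

kept = {sample: n for sample, n in counts.items() if n >= MIN_COUNT}
dropped = sorted(set(counts) - set(kept))
depth = min(kept.values())

print(dropped)             # samples that go into ids.txt
print(len(kept), depth)    # samples kept and the resulting rarefaction depth
print(sum(kept.values()))  # compare with "Total count" of the filtered table
```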


> biom summarize_table -i otu_table_final_rarefied2557.biom -o summary_otu_table_rarefied2557.txt

#Examine output to confirm all samples are now 2557 each
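
#What rarefying to an even depth does, as a minimal sketch: reads are drawn at random WITHOUT replacement
#until the requested depth is reached, and samples with fewer reads than the depth are left out.
#This is an illustration of the idea, not QIIME's code; the OTU names and counts are hypothetical.

```python
import random


def rarefy(otu_counts, depth, seed=0):
    """Subsample one sample's OTU counts to `depth` reads without replacement."""
    if sum(otu_counts.values()) < depth:
        return None  # too shallow: such a sample would be omitted
    # Expand counts into one label per read, then draw `depth` reads.
    pool = [otu for otu, n in otu_counts.items() for _ in range(n)]
    picked = random.Random(seed).sample(pool, depth)
    rarefied = {}
    for otu in picked:
        rarefied[otu] = rarefied.get(otu, 0) + 1
    return rarefied


sample = {"OTU_a": 3000, "OTU_b": 400, "OTU_c": 5}  # hypothetical sample
rare = rarefy(sample, 2557)

print(sum(rare.values()))         # every rarefied sample has exactly `depth` reads
print(rarefy({"OTU_a": 10}, 2557))  # None: fewer reads than the depth
```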


> mkdir Alpha_Diversity_rare2557/

#Need output directory for next steps	


> alpha_diversity.py -i otu_table_final_rarefied2557.biom -m observed_species,observed_otus,ace,chao1,simpson_reciprocal,shannon,simpson_e -o Alpha_Diversity_rare2557/Alpha_Diversity_rare2557.txt -t rep_set.tre

#more Alpha_Diversity_rare2557/Alpha_Diversity_rare2557.txt

#       observed_species        observed_otus   ace     chao1   simpson_reciprocal      shannon simpson_e
#NF2Q6   314.0   314.0   448.65186867    414.563380282   12.625880331    5.6756240279    0.0402098099713
#AKPP2   53.0    53.0    101.291870152   86.8333333333   1.16715200942   0.709390509739  0.0220217360268
#NF9Q6   235.0   235.0   304.822207824   300.282608696   10.0850502305   5.25340748334   0.0429151073639
#NF1Q6   84.0    84.0    132.832402235   115.5   1.87549219756   1.92447929312   0.0223272880661
#RB684Q3 273.0   273.0   313.926793323   318.023809524   14.6602441786   6.12407222775   0.0537005281268
#RB684Q4 183.0   183.0   238.735125297   232.875 9.89053808953   5.00662023379   0.0540466562269
#RB687Q2 159.0   159.0   227.778077558   246.142857143   11.3948266878   4.7776642857    0.0716655766526
#AKPP1   32.0    32.0    78.3586989183   84.5    1.06039169423   0.308480258689  0.0331372404446
#AKPP4   100.0   100.0   159.723796018   145.15  3.37841102918   2.80631386303   0.0337841102918
#NF6Q7   297.0   297.0   427.074861414   402.217391304   11.0151456697   5.21618214413   0.0370880325578
#NF5Q7   207.0   207.0   291.865402331   273.5   7.26080834217   4.58198250376   0.0350763688028
#NF6Q6   219.0   219.0   270.610375448   265.5   11.4044862682   5.20890660856   0.0520752797635
#RB684Q1 178.0   178.0   242.39223404    233.617647059   10.837888731    4.75024705853   0.060887015343
#RB684Q5 241.0   241.0   294.421417052   289.372093023   6.318339204     5.21123020897   0.0262171751204
#AKUT1   98.0    98.0    160.370966277   152.473684211   1.82174215779   1.95272476934   0.0185892056918
#AK325   141.0   141.0   179.604141543   176.357142857   8.60272150375   4.23738110628   0.0610122092464
#NF10Q6  168.0   168.0   234.504513853   233.807692308   19.552706263    5.2826859802    0.116385156328
#NF5Q6   66.0    66.0    91.5051015491   91.0    1.44381513952   1.35622410708   0.0218759869624
#NF9Q7   115.0   115.0   128.054991827   128.125 7.38816086096   4.37408121565   0.0642448770518
#AK342   47.0    47.0    72.4943950575   61.25   1.44981809223   1.22366775286   0.0308471934518
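
#How the columns relate: observed_species and observed_otus are identical in every row, and simpson_e
#equals simpson_reciprocal divided by the number of observed OTUs. A minimal check on three rows copied
#from the table, ASSUMING that definition of Simpson's evenness (the rows are consistent with it).

```python
# (observed_otus, simpson_reciprocal, simpson_e) copied from the table above.
rows = {
    "NF2Q6": (314.0, 12.625880331, 0.0402098099713),
    "AKPP4": (100.0, 3.37841102918, 0.0337841102918),
    "NF10Q6": (168.0, 19.552706263, 0.116385156328),
}

# Recompute evenness as reciprocal Simpson divided by observed OTUs.
recomputed = {sample: recip / observed for sample, (observed, recip, _) in rows.items()}

for sample, (_, _, reported) in rows.items():
    print(sample, recomputed[sample], abs(recomputed[sample] - reported) < 1e-9)
```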


> summarize_taxa_through_plots.py -o Alpha_Diversity_rare2557/taxa_summary2557/ -i otu_table_final_rarefied2557.biom 

#Generates everyone's favorite bar graphs of relative abundance at Phylum to Genus level


> alpha_rarefaction.py -i otu_table_final_rarefied2557.biom -o Rarefaction/ -t rep_set.tre -m Primnoa_map.txt -e 2557

#Generates rarefaction curves to see if sequencing was exhaustive enough


> beta_diversity.py -i otu_table_final_rarefied2557.biom -m unweighted_unifrac,weighted_unifrac,binary_sorensen_dice,bray_curtis -o compar_div_rare2557/ -t rep_set.tre

#To compare weighted/unweighted and phylogenetic/taxonomic metrics


> beta_significance.py -i otu_table_final_rarefied2557.biom -t rep_set.tre -s unweighted_unifrac -o unw_sig.txt

#Determining if samples are statistically significantly different from each other
#Default 100 monte carlo randomizations
#Unweighted unifrac (so abundance doesn't matter, just presence/absence of taxa)

#unweighted unifrac significance test
sample 1        sample 2        p value p value (Bonferroni corrected)
AK325   AK342   0.22    1.0
AK325   AKPP1   0.73    1.0
AK325   AKPP2   0.07    1.0
AK325   AKPP4   0.0     <=1.0e-02
AK325   AKUT1   0.11    1.0
AK325   NF10Q6  0.01    1.0
AK325   NF1Q6   0.21    1.0
AK325   NF2Q6   0.0     <=1.0e-02
AK325   NF5Q6   0.03    1.0
AK325   NF5Q7   0.04    1.0
AK325   NF6Q6   0.0     <=1.0e-02
AK325   NF6Q7   0.0     <=1.0e-02
AK325   NF9Q6   0.0     <=1.0e-02
AK325   NF9Q7   0.0     <=1.0e-02
AK325   RB684Q1 0.01    1.0
AK325   RB684Q3 0.0     <=1.0e-02
AK325   RB684Q4 0.0     <=1.0e-02
AK325   RB684Q5 0.0     <=1.0e-02
AK325   RB687Q2 0.03    1.0
AK342   AKPP1   0.9     1.0
AK342   AKPP2   0.96    1.0
AK342   AKPP4   0.8     1.0
AK342   AKUT1   0.93    1.0
AK342   NF10Q6  0.16    1.0
AK342   NF1Q6   0.84    1.0
AK342   NF2Q6   0.01    1.0
AK342   NF5Q6   0.01    1.0
AK342   NF5Q7   0.08    1.0
AK342   NF6Q6   0.12    1.0
AK342   NF6Q7   0.01    1.0
AK342   NF9Q6   0.07    1.0
AK342   NF9Q7   0.06    1.0
AK342   RB684Q1 0.34    1.0
AK342   RB684Q3 0.07    1.0
AK342   RB684Q4 0.14    1.0
AK342   RB684Q5 0.03    1.0
AK342   RB687Q2 0.48    1.0
AKPP1   AKPP2   0.92    1.0
AKPP1   AKPP4   1.0     1.0
AKPP1   AKUT1   0.72    1.0
AKPP1   NF10Q6  0.48    1.0
AKPP1   NF1Q6   0.75    1.0
AKPP1   NF2Q6   0.42    1.0
AKPP1   NF5Q6   0.22    1.0
AKPP1   NF5Q7   0.48    1.0
AKPP1   NF6Q6   0.85    1.0
AKPP1   NF6Q7   0.49    1.0
AKPP1   NF9Q6   0.46    1.0
AKPP1   NF9Q7   0.41    1.0
AKPP1   RB684Q1 0.48    1.0
AKPP1   RB684Q3 0.59    1.0
AKPP1   RB684Q4 0.96    1.0
AKPP1   RB684Q5 0.68    1.0
AKPP1   RB687Q2 0.87    1.0
AKPP2   AKPP4   0.44    1.0
AKPP2   AKUT1   0.24    1.0
AKPP2   NF10Q6  0.19    1.0
AKPP2   NF1Q6   0.21    1.0
AKPP2   NF2Q6   0.03    1.0
AKPP2   NF5Q6   0.13    1.0
AKPP2   NF5Q7   0.03    1.0
AKPP2   NF6Q6   0.36    1.0
AKPP2   NF6Q7   0.0     <=1.0e-02
AKPP2   NF9Q6   0.03    1.0
AKPP2   NF9Q7   0.01    1.0
AKPP2   RB684Q1 0.19    1.0
AKPP2   RB684Q3 0.02    1.0
AKPP2   RB684Q4 0.51    1.0
AKPP2   RB684Q5 0.03    1.0
AKPP2   RB687Q2 0.14    1.0
AKPP4   AKUT1   0.11    1.0
AKPP4   NF10Q6  0.0     <=1.0e-02
AKPP4   NF1Q6   0.09    1.0
AKPP4   NF2Q6   0.0     <=1.0e-02
AKPP4   NF5Q6   0.02    1.0
AKPP4   NF5Q7   0.0     <=1.0e-02
AKPP4   NF6Q6   0.01    1.0
AKPP4   NF6Q7   0.01    1.0
AKPP4   NF9Q6   0.0     <=1.0e-02
AKPP4   NF9Q7   0.0     <=1.0e-02
AKPP4   RB684Q1 0.0     <=1.0e-02
AKPP4   RB684Q3 0.0     <=1.0e-02
AKPP4   RB684Q4 0.0     <=1.0e-02
AKPP4   RB684Q5 0.0     <=1.0e-02
AKPP4   RB687Q2 0.08    1.0
AKUT1   NF10Q6  0.0     <=1.0e-02
AKUT1   NF2Q6   0.01    1.0
AKUT1   NF5Q6   0.01    1.0
AKUT1   NF5Q7   0.02    1.0
AKUT1   NF6Q6   0.01    1.0
AKUT1   NF6Q7   0.0     <=1.0e-02
AKUT1   NF9Q6   0.0     <=1.0e-02
AKUT1   NF9Q7   0.0     <=1.0e-02
AKUT1   RB684Q1 0.0     <=1.0e-02
AKUT1   RB684Q3 0.0     <=1.0e-02
AKUT1   RB684Q4 0.0     <=1.0e-02
AKUT1   RB684Q5 0.0     <=1.0e-02
AKUT1   RB687Q2 0.05    1.0
NF10Q6  NF1Q6   0.11    1.0
NF10Q6  NF2Q6   0.1     1.0
NF10Q6  NF5Q6   0.09    1.0
NF10Q6  NF5Q7   0.01    1.0
NF10Q6  NF6Q6   0.0     <=1.0e-02
NF10Q6  NF6Q7   0.0     <=1.0e-02
NF10Q6  NF9Q6   0.01    1.0
NF10Q6  NF9Q7   0.0     <=1.0e-02
NF10Q6  RB684Q1 0.17    1.0
NF10Q6  RB684Q3 0.05    1.0
NF10Q6  RB684Q4 0.04    1.0
NF10Q6  RB684Q5 0.0     <=1.0e-02
NF10Q6  RB687Q2 0.18    1.0
NF1Q6   NF2Q6   0.11    1.0
NF1Q6   NF5Q6   0.03    1.0
NF1Q6   NF5Q7   0.41    1.0
NF1Q6   NF6Q6   0.0     <=1.0e-02
NF1Q6   NF6Q7   0.02    1.0
NF1Q6   NF9Q6   0.21    1.0
NF1Q6   NF9Q7   0.16    1.0
NF1Q6   RB684Q1 0.68    1.0
NF1Q6   RB684Q3 0.09    1.0
NF1Q6   RB684Q4 0.35    1.0
NF1Q6   RB684Q5 0.04    1.0
NF1Q6   RB687Q2 0.61    1.0
NF2Q6   NF5Q6   0.38    1.0
NF2Q6   NF5Q7   0.01    1.0
NF2Q6   NF6Q6   0.0     <=1.0e-02
NF2Q6   NF6Q7   0.0     <=1.0e-02
NF2Q6   NF9Q6   0.0     <=1.0e-02
NF2Q6   NF9Q7   0.0     <=1.0e-02
NF2Q6   RB684Q1 0.03    1.0
NF2Q6   RB684Q3 0.0     <=1.0e-02
NF2Q6   RB684Q4 0.01    1.0
NF2Q6   RB684Q5 0.0     <=1.0e-02
NF2Q6   RB687Q2 0.01    1.0
NF5Q6   NF5Q7   0.08    1.0
NF5Q6   NF6Q6   0.01    1.0
NF5Q6   NF6Q7   0.0     <=1.0e-02
NF5Q6   NF9Q6   0.04    1.0
NF5Q6   NF9Q7   0.01    1.0
NF5Q6   RB684Q1 0.27    1.0
NF5Q6   RB684Q3 0.17    1.0
NF5Q6   RB684Q4 0.05    1.0
NF5Q6   RB684Q5 0.01    1.0
NF5Q6   RB687Q2 0.05    1.0
NF5Q7   NF6Q6   0.0     <=1.0e-02
NF5Q7   NF6Q7   0.0     <=1.0e-02
NF5Q7   NF9Q6   0.05    1.0
NF5Q7   NF9Q7   0.0     <=1.0e-02
NF5Q7   RB684Q1 0.12    1.0
NF5Q7   RB684Q3 0.01    1.0
NF5Q7   RB684Q4 0.1     1.0
NF5Q7   RB684Q5 0.02    1.0
NF5Q7   RB687Q2 0.07    1.0
NF6Q6   NF6Q7   0.0     <=1.0e-02
NF6Q6   NF9Q6   0.0     <=1.0e-02
NF6Q6   NF9Q7   0.0     <=1.0e-02
NF6Q6   RB684Q1 0.07    1.0
NF6Q6   RB684Q3 0.0     <=1.0e-02
NF6Q6   RB684Q4 0.0     <=1.0e-02
NF6Q6   RB684Q5 0.0     <=1.0e-02
NF6Q6   RB687Q2 0.0     <=1.0e-02
NF6Q7   NF9Q6   0.0     <=1.0e-02
NF6Q7   NF9Q7   0.0     <=1.0e-02
NF6Q7   RB684Q1 0.0     <=1.0e-02
NF6Q7   RB684Q3 0.0     <=1.0e-02
NF6Q7   RB684Q4 0.0     <=1.0e-02
NF6Q7   RB684Q5 0.0     <=1.0e-02
NF6Q7   RB687Q2 0.0     <=1.0e-02
NF9Q6   NF9Q7   0.0     <=1.0e-02
NF9Q6   RB684Q1 0.37    1.0
NF9Q6   RB684Q3 0.02    1.0
NF9Q6   RB684Q4 0.26    1.0
NF9Q6   RB684Q5 0.04    1.0
NF9Q6   RB687Q2 0.21    1.0
NF9Q7   RB684Q1 0.02    1.0
NF9Q7   RB684Q3 0.0     <=1.0e-02
NF9Q7   RB684Q4 0.0     <=1.0e-02
NF9Q7   RB684Q5 0.0     <=1.0e-02
NF9Q7   RB687Q2 0.0     <=1.0e-02
RB684Q1 RB684Q3 0.06    1.0
RB684Q1 RB684Q4 0.03    1.0
RB684Q1 RB684Q5 0.0     <=1.0e-02
RB684Q1 RB687Q2 0.06    1.0
RB684Q3 RB684Q4 0.0     <=1.0e-02
RB684Q3 RB684Q5 0.14    1.0
RB684Q3 RB687Q2 0.04    1.0
RB684Q4 RB684Q5 0.02    1.0
RB684Q4 RB687Q2 0.06    1.0
RB684Q5 RB687Q2 0.01    1.0
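
#Why almost every Bonferroni-corrected value is 1.0: 20 samples give 190 pairwise comparisons, and with
#100 randomizations the smallest non-zero p value is 0.01, so 0.01 x 190 is capped at 1.0.
#A minimal sketch of that arithmetic, ASSUMING p = (number of randomizations at least as extreme) / (number of
#randomizations) and a conventional alpha of 0.05, which is not stated in this log.

```python
import math

n_samples = 20
n_permutations = 100   # default number of Monte Carlo randomizations noted above
alpha = 0.05           # assumption: conventional significance threshold

n_pairs = n_samples * (n_samples - 1) // 2

# Smallest non-zero raw p value and its Bonferroni-corrected value (capped at 1).
smallest_nonzero_p = 1 / n_permutations
corrected = min(1.0, smallest_nonzero_p * n_pairs)

# Randomizations needed before a single non-zero p value could pass the corrected threshold.
needed = math.ceil(n_pairs / alpha)

print(n_pairs, corrected, needed)
```

#So with the default settings only pairs with a raw p value of 0.0 can show up as significant after correction.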


> beta_significance.py -i otu_table_final_rarefied2557.biom -t rep_set.tre -s weighted_unifrac -o w_sig.txt

#Determining if samples are statistically significantly different from each other
#Default 100 monte carlo randomizations
#Weighted unifrac (so abundance matters)

#weighted unifrac significance test
sample 1        sample 2        p value p value (Bonferroni corrected)
AK325   AK342   0.53    1.0
AK325   AKPP1   0.54    1.0
AK325   AKPP2   0.51    1.0
AK325   AKPP4   0.48    1.0
AK325   AKUT1   0.59    1.0
AK325   NF10Q6  0.26    1.0
AK325   NF1Q6   0.7     1.0
AK325   NF2Q6   0.29    1.0
AK325   NF5Q6   0.84    1.0
AK325   NF5Q7   0.48    1.0
AK325   NF6Q6   0.38    1.0
AK325   NF6Q7   0.34    1.0
AK325   NF9Q6   0.62    1.0
AK325   NF9Q7   0.79    1.0
AK325   RB684Q1 0.87    1.0
AK325   RB684Q3 0.26    1.0
AK325   RB684Q4 0.5     1.0
AK325   RB684Q5 0.28    1.0
AK325   RB687Q2 0.86    1.0
AK342   AKPP1   0.36    1.0
AK342   AKPP2   0.42    1.0
AK342   AKPP4   0.63    1.0
AK342   AKUT1   0.59    1.0
AK342   NF10Q6  0.4     1.0
AK342   NF1Q6   0.61    1.0
AK342   NF2Q6   0.62    1.0
AK342   NF5Q6   0.53    1.0
AK342   NF5Q7   0.55    1.0
AK342   NF6Q6   0.66    1.0
AK342   NF6Q7   0.45    1.0
AK342   NF9Q6   0.47    1.0
AK342   NF9Q7   0.48    1.0
AK342   RB684Q1 0.59    1.0
AK342   RB684Q3 0.4     1.0
AK342   RB684Q4 0.34    1.0
AK342   RB684Q5 0.38    1.0
AK342   RB687Q2 0.52    1.0
AKPP1   AKPP2   0.62    1.0
AKPP1   AKPP4   0.6     1.0
AKPP1   AKUT1   0.35    1.0
AKPP1   NF10Q6  0.32    1.0
AKPP1   NF1Q6   0.6     1.0
AKPP1   NF2Q6   0.52    1.0
AKPP1   NF5Q6   0.48    1.0
AKPP1   NF5Q7   0.55    1.0
AKPP1   NF6Q6   0.61    1.0
AKPP1   NF6Q7   0.48    1.0
AKPP1   NF9Q6   0.49    1.0
AKPP1   NF9Q7   0.48    1.0
AKPP1   RB684Q1 0.51    1.0
AKPP1   RB684Q3 0.45    1.0
AKPP1   RB684Q4 0.38    1.0
AKPP1   RB684Q5 0.45    1.0
AKPP1   RB687Q2 0.44    1.0
AKPP2   AKPP4   0.5     1.0
AKPP2   AKUT1   0.33    1.0
AKPP2   NF10Q6  0.38    1.0
AKPP2   NF1Q6   0.57    1.0
AKPP2   NF2Q6   0.54    1.0
AKPP2   NF5Q6   0.5     1.0
AKPP2   NF5Q7   0.58    1.0
AKPP2   NF6Q6   0.53    1.0
AKPP2   NF6Q7   0.48    1.0
AKPP2   NF9Q6   0.53    1.0
AKPP2   NF9Q7   0.47    1.0
AKPP2   RB684Q1 0.46    1.0
AKPP2   RB684Q3 0.36    1.0
AKPP2   RB684Q4 0.41    1.0
AKPP2   RB684Q5 0.35    1.0
AKPP2   RB687Q2 0.45    1.0
AKPP4   AKUT1   0.79    1.0
AKPP4   NF10Q6  0.4     1.0
AKPP4   NF1Q6   0.63    1.0
AKPP4   NF2Q6   0.48    1.0
AKPP4   NF5Q6   0.61    1.0
AKPP4   NF5Q7   0.35    1.0
AKPP4   NF6Q6   0.36    1.0
AKPP4   NF6Q7   0.41    1.0
AKPP4   NF9Q6   0.49    1.0
AKPP4   NF9Q7   0.63    1.0
AKPP4   RB684Q1 0.64    1.0
AKPP4   RB684Q3 0.35    1.0
AKPP4   RB684Q4 0.48    1.0
AKPP4   RB684Q5 0.48    1.0
AKPP4   RB687Q2 0.54    1.0
AKUT1   NF10Q6  0.47    1.0
AKUT1   NF1Q6   0.6     1.0
AKUT1   NF2Q6   0.45    1.0
AKUT1   NF5Q6   0.59    1.0
AKUT1   NF5Q7   0.58    1.0
AKUT1   NF6Q6   0.57    1.0
AKUT1   NF6Q7   0.53    1.0
AKUT1   NF9Q6   0.44    1.0
AKUT1   NF9Q7   0.45    1.0
AKUT1   RB684Q1 0.58    1.0
AKUT1   RB684Q3 0.53    1.0
AKUT1   RB684Q4 0.41    1.0
AKUT1   RB684Q5 0.36    1.0
AKUT1   RB687Q2 0.61    1.0
NF10Q6  NF1Q6   0.65    1.0
NF10Q6  NF2Q6   0.74    1.0
NF10Q6  NF5Q6   0.74    1.0
NF10Q6  NF5Q7   0.74    1.0
NF10Q6  NF6Q6   0.86    1.0
NF10Q6  NF6Q7   0.95    1.0
NF10Q6  NF9Q6   0.68    1.0
NF10Q6  NF9Q7   0.68    1.0
NF10Q6  RB684Q1 0.9     1.0
NF10Q6  RB684Q3 0.65    1.0
NF10Q6  RB684Q4 0.94    1.0
NF10Q6  RB684Q5 0.61    1.0
NF10Q6  RB687Q2 0.64    1.0
NF1Q6   NF2Q6   0.73    1.0
NF1Q6   NF5Q6   0.97    1.0
NF1Q6   NF5Q7   0.85    1.0
NF1Q6   NF6Q6   0.77    1.0
NF1Q6   NF6Q7   0.68    1.0
NF1Q6   NF9Q6   0.91    1.0
NF1Q6   NF9Q7   0.9     1.0
NF1Q6   RB684Q1 0.96    1.0
NF1Q6   RB684Q3 0.55    1.0
NF1Q6   RB684Q4 0.83    1.0
NF1Q6   RB684Q5 0.47    1.0
NF1Q6   RB687Q2 0.73    1.0
NF2Q6   NF5Q6   0.86    1.0
NF2Q6   NF5Q7   0.38    1.0
NF2Q6   NF6Q6   0.27    1.0
NF2Q6   NF6Q7   0.85    1.0
NF2Q6   NF9Q6   0.7     1.0
NF2Q6   NF9Q7   0.72    1.0
NF2Q6   RB684Q1 0.85    1.0
NF2Q6   RB684Q3 0.76    1.0
NF2Q6   RB684Q4 0.76    1.0
NF2Q6   RB684Q5 0.57    1.0
NF2Q6   RB687Q2 0.55    1.0
NF5Q6   NF5Q7   0.81    1.0
NF5Q6   NF6Q6   0.84    1.0
NF5Q6   NF6Q7   0.75    1.0
NF5Q6   NF9Q6   0.91    1.0
NF5Q6   NF9Q7   0.76    1.0
NF5Q6   RB684Q1 0.98    1.0
NF5Q6   RB684Q3 0.66    1.0
NF5Q6   RB684Q4 0.81    1.0
NF5Q6   RB684Q5 0.7     1.0
NF5Q6   RB687Q2 0.82    1.0
NF5Q7   NF6Q6   0.61    1.0
NF5Q7   NF6Q7   0.81    1.0
NF5Q7   NF9Q6   0.9     1.0
NF5Q7   NF9Q7   0.94    1.0
NF5Q7   RB684Q1 0.9     1.0
NF5Q7   RB684Q3 0.72    1.0
NF5Q7   RB684Q4 0.96    1.0
NF5Q7   RB684Q5 0.85    1.0
NF5Q7   RB687Q2 0.93    1.0
NF6Q6   NF6Q7   0.81    1.0
NF6Q6   NF9Q6   0.82    1.0
NF6Q6   NF9Q7   0.91    1.0
NF6Q6   RB684Q1 0.99    1.0
NF6Q6   RB684Q3 0.91    1.0
NF6Q6   RB684Q4 0.98    1.0
NF6Q6   RB684Q5 0.93    1.0
NF6Q6   RB687Q2 0.98    1.0
NF6Q7   NF9Q6   0.73    1.0
NF6Q7   NF9Q7   0.79    1.0
NF6Q7   RB684Q1 0.99    1.0
NF6Q7   RB684Q3 0.67    1.0
NF6Q7   RB684Q4 0.88    1.0
NF6Q7   RB684Q5 0.81    1.0
NF6Q7   RB687Q2 0.76    1.0
NF9Q6   NF9Q7   0.7     1.0
NF9Q6   RB684Q1 1.0     1.0
NF9Q6   RB684Q3 0.73    1.0
NF9Q6   RB684Q4 0.95    1.0
NF9Q6   RB684Q5 0.68    1.0
NF9Q6   RB687Q2 0.9     1.0
NF9Q7   RB684Q1 0.94    1.0
NF9Q7   RB684Q3 0.9     1.0
NF9Q7   RB684Q4 1.0     1.0
NF9Q7   RB684Q5 0.91    1.0
NF9Q7   RB687Q2 0.98    1.0
RB684Q1 RB684Q3 0.29    1.0
RB684Q1 RB684Q4 0.54    1.0
RB684Q1 RB684Q5 0.28    1.0
RB684Q1 RB687Q2 0.4     1.0
RB684Q3 RB684Q4 0.02    1.0
RB684Q3 RB684Q5 0.67    1.0
RB684Q3 RB687Q2 0.33    1.0
RB684Q4 RB684Q5 0.17    1.0
RB684Q4 RB687Q2 0.89    1.0
RB684Q5 RB687Q2 0.59    1.0


> principal_coordinates.py -i compar_div_rare2557/ -o compar_div_rare2557_PCoA/

#Conducts principal coordinate analysis on each of the beta diversity stats (e.g., unweighted unifrac)


> make_2d_plots.py -i compar_div_rare2557_PCoA/pcoa_weighted_unifrac_otu_table_final_rarefied2557.txt -m ../Primnoa_map.txt -o PCoA_2D_plot_WU/
	
#Making 2D PCoA plots based on weighted unifrac OTU table


> make_2d_plots.py -i compar_div_rare2557_PCoA/pcoa_unweighted_unifrac_otu_table_final_rarefied2557.txt -m ../Primnoa_map.txt -o PCoA_2D_plot_UU/

#Making 2D PCoA plots based on unweighted unifrac OTU table


> make_2d_plots.py -i compar_div_rare2557_PCoA/pcoa_binary_sorensen_dice_otu_table_final_rarefied2557.txt -m ../Primnoa_map.txt -o PCoA_2D_plot_BSD/

#Making 2D PCoA plots based on binary sorensen dice OTU table


> make_2d_plots.py -i compar_div_rare2557_PCoA/pcoa_bray_curtis_otu_table_final_rarefied2557.txt -m ../Primnoa_map.txt -o PCoA_2D_plot_BC/

#Making 2D PCoA plots based on bray curtis OTU table


> compute_core_microbiome.py -i otu_table_final_rarefied2557.biom -o otu_table_core

#Identifying the core microbiome (shared OTUs) 
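#Note: core_table_100.biom holds the OTUs found in 100% of samples. A rough cross-check is sketched below, not QIIME's implementation.
#Assumptions: the table was exported with biom convert --to-tsv without a taxonomy column; core_otus is a hypothetical helper name.

```shell
# List OTU IDs that have a non-zero count in every sample column.
# $1 = tab-delimited OTU table (first column = OTU ID, remaining columns = samples)
core_otus() {
    awk -F '\t' '
        /^#/ { next }                                  # skip header lines
        {
            for (i = 2; i <= NF; i++) if ($i + 0 == 0) next   # absent in one sample
            print $1
        }
    ' "$1"
}
```

#Usage: core_otus otu_table_final_rarefied2557.txt | wc -l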


> summarize_taxa_through_plots.py -o Core_taxa_summary2557/ -i otu_table_core/core_table_100.biom 
	
#Visualizing core as bar graphs

> make_otu_heatmap.py -i Core_taxa_summary2557/core_table_100_L5.biom -o heatmap_core_family.pdf

#Visualizing core through heatmap at Family level

__________________________________________________________________________________________
__________________________________________________________________________________________

Going back and doing separate analysis of only the samples with >10,000 sequences

> pick_open_reference_otus.py -i outseqs_10K.fasta -m usearch61 -o usearch61_openref_Green_10K -f 


> cd uclust_assigned_taxonomy

#1929 lines in rep_set_tax_assignments.txt by wc -l
#grep -c k__Bact = 1124 and grep -c Unassigned = 805 (1124 + 805 = 1929); 31 chloroplasts and 5 mitochondria need to be removed
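#Note: the grep counts can be collected in one pass. A minimal sketch; tally_taxa is a hypothetical helper name.
#Chloroplast and mitochondria lineages sit inside k__Bacteria in the Greengenes taxonomy, so those lines are also counted under k__Bacteria.

```shell
# Count assignment lines matching each taxonomy label.
# $1 = taxonomy assignment file (e.g. rep_set_tax_assignments.txt)
tally_taxa() {
    for pattern in k__Bacteria Unassigned c__Chloroplast f__mitochondria; do
        # grep -c prints the number of matching lines (0 if none)
        printf '%s\t%s\n' "$pattern" "$(grep -cF -- "$pattern" "$1")"
    done
}
```

#Usage: tally_taxa rep_set_tax_assignments.txt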


> cd pynast_aligned_seqs
#1478 sequences in rep_set_failures.fasta; check the first 3 in BLAST to see if really bad
#All really were garbage (18S, mitochondria, short), so want to move forward using the otu_table_mc2_w_tax_no_pynast_failures.biom rather than otu_table_mc2_w_tax.biom 
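#Note: one way to pull out the first few failed sequences for pasting into BLAST. A minimal sketch; first_n_fasta is a hypothetical helper name.

```shell
# Print the first N records of a FASTA file, including multi-line sequences.
# $1 = FASTA file, $2 = number of records to keep
first_n_fasta() {
    awk -v n="$2" '
        /^>/ { seen++ }              # each header line starts a new record
        seen > n + 0 { exit }        # stop once record N+1 begins
        { print }
    ' "$1"
}
```

#Usage: first_n_fasta rep_set_failures.fasta 3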


> cd usearch61_openref_Green_10K
> filter_taxa_from_otu_table.py -i otu_table_mc2_w_tax_no_pynast_failures.biom -o otu_table_final.biom -n c__Chloroplast,f__mitochondria

#Removing Chloroplast and mitochondrial sequences.
# > biom convert -i file.biom -o file.txt --to-tsv  to make input & output biom tables viewable as text, and use > wc -l file.txt to count lines
#Input 1192, output 1159 (confirmed removal of chloroplasts and mitochondria)
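#Note: wc -l also counts header lines. The 1159 lines in the output table are two more than the 1157 observations reported by biom summarize_table below, which is consistent with two '#' header lines in the TSV.
#A minimal sketch that counts OTU rows only; count_otus is a hypothetical helper name.

```shell
# Count OTU rows in a tab-delimited OTU table, ignoring '#' header lines.
# $1 = text table produced by biom convert --to-tsv
count_otus() {
    grep -vc '^#' "$1"
}
```

#Usage: count_otus file.txt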


> biom summarize_table -i otu_table_final.biom -o summary_otu_table_final.txt

#Use 'more' to view the summary. It lists per-sample sequence counts, so you can pick the lowest count as the rarefaction depth
Num samples: 14
Num observations: 1157
Total count: 294504
Table density (fraction of non-zero values): 0.205

Counts/sample summary:
 Min: 1096.0
 Max: 44282.0
 Median: 22190.500
 Mean: 21036.000
 Std. dev.: 12483.262
 Sample Metadata Categories: None provided
 Observation Metadata Categories: taxonomy

Counts/sample detail:
 NF20Q1: 1096.0
 NF9Q6: 6772.0
 NF6Q6: 7541.0
 NF5Q7: 11422.0
 NF6Q7: 14652.0
 NF2Q6: 17030.0
 AKPP4: 22139.0
 AK325: 22242.0
 NF1Q6: 22447.0
 AKUT1: 23064.0
 NF5Q6: 23285.0
 AK342: 35370.0
 AKPP1: 43162.0
 AKPP2: 44282.0


> filter_samples_from_otu_table.py -i otu_table_final.biom -o filtered_otu_table.biom --sample_id_fp ids2.txt --negate_sample_id_fp

#Created file ids2.txt listing the three samples with counts below 10,000: NF20Q1, NF9Q6, NF6Q6
#With --negate_sample_id_fp the script discards the samples whose IDs are listed in ids2.txt (without it, only the listed samples would be kept)
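#Note: the ID list could also be built from the biom summarize_table output instead of by hand.
#A minimal sketch, assuming the summary layout shown above (a "Counts/sample detail:" section with one "ID: count" line per sample); low_count_samples is a hypothetical helper name.

```shell
# Print the IDs of samples whose sequence count is below a threshold.
# $1 = biom summarize_table output file, $2 = minimum count required to keep a sample
low_count_samples() {
    awk -v min="$2" '
        /^Counts\/sample detail:/ { in_detail = 1; next }
        in_detail && NF == 2 {
            id = $1
            sub(/:$/, "", id)                   # strip the trailing colon
            if ($2 + 0 < min + 0) print id
        }
    ' "$1"
}
```

#Usage: low_count_samples summary_otu_table_final.txt 10000 > ids2.txt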

> biom summarize_table -i filtered_otu_table.biom -o summary_otu_table_filtered_final.txt

#Use 'more' to view the summary. The lowest remaining count (11,422) sets the rarefaction depth
Num samples: 11
Num observations: 1157
Total count: 279095
Table density (fraction of non-zero values): 0.210

Counts/sample summary:
 Min: 11422.0
 Max: 44282.0
 Median: 22447.000
 Mean: 25372.273
 Std. dev.: 10408.167
 Sample Metadata Categories: None provided
 Observation Metadata Categories: taxonomy

Counts/sample detail:
 NF5Q7: 11422.0
 NF6Q7: 14652.0
 NF2Q6: 17030.0
 AKPP4: 22139.0
 AK325: 22242.0
 NF1Q6: 22447.0
 AKUT1: 23064.0
 NF5Q6: 23285.0
 AK342: 35370.0
 AKPP1: 43162.0
 AKPP2: 44282.0


> single_rarefaction.py -i filtered_otu_table.biom -o otu_table_final_rarefied11422.biom -d 11422

#Rarefaction step: subsample every sample to an even depth of 11,422 sequences so they can be compared
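#Note: rarefaction draws a fixed number of reads from each sample at random, without replacement.
#The sketch below illustrates the idea for a single sample; it is not QIIME's implementation. Assumptions: GNU shuf is available, the input is a two-column "OTU<TAB>count" file with integer counts and no spaces in OTU IDs, and rarefy_sample is a hypothetical helper name.

```shell
# Subsample one sample to a fixed depth without replacement.
# $1 = two-column file (OTU ID <TAB> integer count), $2 = depth
rarefy_sample() {
    # Expand each OTU into one line per read, draw $2 lines at random,
    # then collapse the draws back into per-OTU counts.
    awk -F '\t' '{ for (i = 0; i < $2; i++) print $1 }' "$1" |
        shuf -n "$2" |
        sort | uniq -c |
        awk '{ print $2 "\t" $1 }'
}
```

#The draws are random, so repeated runs give different tables; OTUs that receive no reads drop out of the output.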


> biom summarize_table -i otu_table_final_rarefied11422.biom -o summary_otu_table_rarefied11422.txt

#Examine output to confirm all samples are now 11,422 each
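#Note: the check can be scripted rather than read by eye. A minimal sketch, assuming the summary layout shown earlier; check_even_depth is a hypothetical helper name.

```shell
# Succeed (exit status 0) only if every sample in the summary has the expected count.
# $1 = biom summarize_table output file, $2 = expected depth
check_even_depth() {
    awk -v depth="$2" '
        /^Counts\/sample detail:/ { in_detail = 1; next }
        in_detail && NF == 2 { n++; if ($2 + 0 != depth + 0) bad++ }
        END {
            if (n > 0 && bad == 0) { exit 0 }
            exit 1
        }
    ' "$1"
}
```

#Usage: check_even_depth summary_otu_table_rarefied11422.txt 11422 && echo "all samples at 11422"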


> mkdir Alpha_Diversity_rare11422/


> alpha_diversity.py -i otu_table_final_rarefied11422.biom -m observed_otus,ace,chao1,simpson_reciprocal,shannon,simpson_e -o Alpha_Diversity_rare11422/Alpha_Diversity_rare11422.txt -t rep_set.tre

#more Alpha_Diversity_rare11422/Alpha_Diversity_rare11422.txt

        observed_otus   ace     chao1   simpson_reciprocal      shannon simpson_e
NF2Q6   451.0   496.450394598   508.37254902    11.8855857097   5.69753801711   0.0263538485802
AKPP2   101.0   131.376641022   122.75  1.18591120003   0.808283510251  0.0117416950498
NF1Q6   165.0   230.316872428   233.875 1.87451087269   1.96655696729   0.0113606719557
AKPP1   71.0    117.356767299   109.153846154   1.06604024443   0.348417133241  0.01501465133
AKPP4   192.0   234.682478973   230.277777778   3.44859711006   2.90475710982   0.0179614432815
NF6Q7   477.0   524.313465064   526.567164179   10.8122002368   5.35961554883   0.0226670864504
NF5Q7   334.0   370.715337816   363.5   7.35321114775   4.70351212322   0.0220156022388
AKUT1   154.0   183.688219405   185.2   1.78756875542   1.92482699417   0.0116075893209
AK325   189.0   202.769712432   202.636363636   8.40272917521   4.26429987699   0.0444588845249
NF5Q6   113.0   135.758157895   127.04  1.43880268648   1.36412175706   0.012732767137
AK342   83.0    147.830779533   125.5   1.41035486193   1.16497023068   0.0169922272521
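
#Note: in the table above, simpson_e equals simpson_reciprocal divided by observed_otus (e.g., NF2Q6: 11.8856 / 451 = 0.02635).
#The sketch below recomputes three of the metrics from raw counts for one sample. It is not QIIME's code; it assumes Shannon uses log base 2 (the scikit-bio default), and alpha_metrics is a hypothetical helper name.

```shell
# Compute observed OTUs, Shannon (log2), reciprocal Simpson and Simpson evenness.
# $1 = file with one OTU count per line for a single sample (at least one count > 0)
alpha_metrics() {
    awk '
        $1 > 0 { c[++s] = $1; total += $1 }      # keep only OTUs that are present
        END {
            for (i = 1; i <= s; i++) {
                p = c[i] / total
                h -= p * log(p) / log(2)         # Shannon entropy in bits
                d += p * p                       # Simpson dominance
            }
            printf "observed_otus\t%d\n", s
            printf "shannon\t%.4f\n", h
            printf "simpson_reciprocal\t%.4f\n", 1 / d
            printf "simpson_e\t%.4f\n", (1 / d) / s
        }
    ' "$1"
}
```

#A perfectly even sample of four OTUs gives shannon 2.0, simpson_reciprocal 4.0 and simpson_e 1.0; the low simpson_e values above indicate communities dominated by a few OTUs.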


> summarize_taxa_through_plots.py -o Alpha_Diversity_rare11422/taxa_summary11422/ -i otu_table_final_rarefied11422.biom 


> alpha_rarefaction.py -i otu_table_final_rarefied11422.biom -o Rarefaction/ -t rep_set.tre -m Primnoa_map.txt -e 11422


> beta_diversity.py -i otu_table_final_rarefied11422.biom -m unweighted_unifrac,weighted_unifrac,binary_sorensen_dice,bray_curtis -o compar_div_rare11422/ -t rep_set.tre
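
#Note: of the four metrics, Bray-Curtis is the simplest to state: the sum of absolute count differences between two samples divided by the sum of all their counts.
#A minimal sketch for one pair of samples, not QIIME's code; the three-column input layout and the name bray_curtis are assumptions.

```shell
# Bray-Curtis dissimilarity between two samples.
# $1 = tab-delimited file: OTU ID, count in sample A, count in sample B
bray_curtis() {
    awk -F '\t' '
        /^#/ { next }                      # skip header lines
        {
            diff = $2 - $3
            if (diff < 0) diff = -diff
            num += diff                    # sum of |a - b|
            den += $2 + $3                 # sum of (a + b)
        }
        END { printf "%.4f\n", num / den }
    ' "$1"
}
```

#A result of 0 means identical counts; 1 means the two samples share no OTUs.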


> beta_significance.py -i otu_table_final_rarefied11422.biom -t rep_set.tre -s unweighted_unifrac -o unw_sig.txt

#unweighted unifrac significance test
sample 1        sample 2        p value p value (Bonferroni corrected)
AK325   AK342   0.0     <=1.0e-02
AK325   AKPP1   0.0     <=1.0e-02
AK325   AKPP2   0.0     <=1.0e-02
AK325   AKPP4   0.0     <=1.0e-02
AK325   AKUT1   0.0     <=1.0e-02
AK325   NF1Q6   0.0     <=1.0e-02
AK325   NF2Q6   0.0     <=1.0e-02
AK325   NF5Q6   0.02    1.0
AK325   NF5Q7   0.0     <=1.0e-02
AK325   NF6Q7   0.0     <=1.0e-02
AK342   AKPP1   0.64    1.0
AK342   AKPP2   0.04    1.0
AK342   AKPP4   0.03    1.0
AK342   AKUT1   0.32    1.0
AK342   NF1Q6   0.0     <=1.0e-02
AK342   NF2Q6   0.0     <=1.0e-02
AK342   NF5Q6   0.07    1.0
AK342   NF5Q7   0.02    1.0
AK342   NF6Q7   0.0     <=1.0e-02
AKPP1   AKPP2   0.79    1.0
AKPP1   AKPP4   0.82    1.0
AKPP1   AKUT1   0.15    1.0
AKPP1   NF1Q6   0.43    1.0
AKPP1   NF2Q6   0.01    0.55
AKPP1   NF5Q6   0.19    1.0
AKPP1   NF5Q7   0.37    1.0
AKPP1   NF6Q7   0.04    1.0
AKPP2   AKPP4   0.09    1.0
AKPP2   AKUT1   0.0     <=1.0e-02
AKPP2   NF1Q6   0.0     <=1.0e-02
AKPP2   NF2Q6   0.0     <=1.0e-02
AKPP2   NF5Q6   0.01    0.55
AKPP2   NF5Q7   0.0     <=1.0e-02
AKPP2   NF6Q7   0.0     <=1.0e-02
AKPP4   AKUT1   0.01    0.55
AKPP4   NF1Q6   0.0     <=1.0e-02
AKPP4   NF2Q6   0.0     <=1.0e-02
AKPP4   NF5Q6   0.12    1.0
AKPP4   NF5Q7   0.0     <=1.0e-02
AKPP4   NF6Q7   0.0     <=1.0e-02
AKUT1   NF1Q6   0.0     <=1.0e-02
AKUT1   NF2Q6   0.0     <=1.0e-02
AKUT1   NF5Q6   0.01    0.55
AKUT1   NF5Q7   0.0     <=1.0e-02
AKUT1   NF6Q7   0.0     <=1.0e-02
NF1Q6   NF2Q6   0.0     <=1.0e-02
NF1Q6   NF5Q6   0.34    1.0
NF1Q6   NF5Q7   0.0     <=1.0e-02
NF1Q6   NF6Q7   0.0     <=1.0e-02
NF2Q6   NF5Q6   0.06    1.0
NF2Q6   NF5Q7   0.0     <=1.0e-02
NF2Q6   NF6Q7   0.0     <=1.0e-02
NF5Q6   NF5Q7   0.04    1.0
NF5Q6   NF6Q7   0.0     <=1.0e-02
NF5Q7   NF6Q7   0.0     <=1.0e-02
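
#Note: with 11 samples there are 11 x 10 / 2 = 55 pairwise comparisons. The Bonferroni column multiplies each raw p value by 55 and caps it at 1.0, so 0.01 becomes 0.55 and anything at or above 0.02 becomes 1.0, as in the table above.
#The sketch below redoes that arithmetic from the raw p values. Assumptions: the input has the layout shown above, and bonferroni is a hypothetical helper name. Rows with a raw p of 0.0 print as 0.00 here, whereas the output above reports a bound (<=1.0e-02) for them.

```shell
# Recompute Bonferroni-corrected p values: min(1, p * number_of_comparisons).
# $1 = beta_significance.py output (sample 1, sample 2, raw p value, ...)
bonferroni() {
    awk '
        /^#/ || /^sample 1/ { next }               # skip comment and header lines
        NF >= 3 { a[++n] = $1; b[n] = $2; p[n] = $3 }
        END {
            for (i = 1; i <= n; i++) {
                adj = p[i] * n                     # n = number of comparisons read
                if (adj > 1) adj = 1
                printf "%s\t%s\t%s\t%.2f\n", a[i], b[i], p[i], adj
            }
        }
    ' "$1"
}
```

#Usage: bonferroni unw_sig.txt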


> beta_significance.py -i otu_table_final_rarefied11422.biom -t rep_set.tre -s weighted_unifrac -o w_sig.txt

#weighted unifrac significance test
sample 1        sample 2        p value p value (Bonferroni corrected)
AK325   AK342   0.36    1.0
AK325   AKPP1   0.3     1.0
AK325   AKPP2   0.35    1.0
AK325   AKPP4   0.2     1.0
AK325   AKUT1   0.33    1.0
AK325   NF1Q6   0.9     1.0
AK325   NF2Q6   0.26    1.0
AK325   NF5Q6   0.97    1.0
AK325   NF5Q7   0.45    1.0
AK325   NF6Q7   0.47    1.0
AK342   AKPP1   0.26    1.0
AK342   AKPP2   0.35    1.0
AK342   AKPP4   0.52    1.0
AK342   AKUT1   0.34    1.0
AK342   NF1Q6   0.63    1.0
AK342   NF2Q6   0.41    1.0
AK342   NF5Q6   0.57    1.0
AK342   NF5Q7   0.36    1.0
AK342   NF6Q7   0.46    1.0
AKPP1   AKPP2   0.36    1.0
AKPP1   AKPP4   0.5     1.0
AKPP1   AKUT1   0.23    1.0
AKPP1   NF1Q6   0.61    1.0
AKPP1   NF2Q6   0.34    1.0
AKPP1   NF5Q6   0.52    1.0
AKPP1   NF5Q7   0.39    1.0
AKPP1   NF6Q7   0.36    1.0
AKPP2   AKPP4   0.4     1.0
AKPP2   AKUT1   0.21    1.0
AKPP2   NF1Q6   0.67    1.0
AKPP2   NF2Q6   0.33    1.0
AKPP2   NF5Q6   0.52    1.0
AKPP2   NF5Q7   0.39    1.0
AKPP2   NF6Q7   0.52    1.0
AKPP4   AKUT1   0.52    1.0
AKPP4   NF1Q6   0.61    1.0
AKPP4   NF2Q6   0.27    1.0
AKPP4   NF5Q6   0.75    1.0
AKPP4   NF5Q7   0.32    1.0
AKPP4   NF6Q7   0.29    1.0
AKUT1   NF1Q6   0.69    1.0
AKUT1   NF2Q6   0.34    1.0
AKUT1   NF5Q6   0.64    1.0
AKUT1   NF5Q7   0.43    1.0
AKUT1   NF6Q7   0.3     1.0
NF1Q6   NF2Q6   0.98    1.0
NF1Q6   NF5Q6   0.99    1.0
NF1Q6   NF5Q7   0.95    1.0
NF1Q6   NF6Q7   0.88    1.0
NF2Q6   NF5Q6   0.89    1.0
NF2Q6   NF5Q7   0.44    1.0
NF2Q6   NF6Q7   0.64    1.0
NF5Q6   NF5Q7   0.97    1.0
NF5Q6   NF6Q7   0.92    1.0
NF5Q7   NF6Q7   0.85    1.0


> principal_coordinates.py -i compar_div_rare11422/ -o compar_div_rare11422_PCoA/


> make_2d_plots.py -i compar_div_rare11422_PCoA/pcoa_weighted_unifrac_otu_table_final_rarefied11422.txt -m ../Primnoa_map.txt -o PCoA_2D_plot_WU/


> make_2d_plots.py -i compar_div_rare11422_PCoA/pcoa_unweighted_unifrac_otu_table_final_rarefied11422.txt -m ../Primnoa_map.txt -o PCoA_2D_plot_UU/


> make_2d_plots.py -i compar_div_rare11422_PCoA/pcoa_binary_sorensen_dice_otu_table_final_rarefied11422.txt -m ../Primnoa_map.txt -o PCoA_2D_plot_BSD/


> make_2d_plots.py -i compar_div_rare11422_PCoA/pcoa_bray_curtis_otu_table_final_rarefied11422.txt -m ../Primnoa_map.txt -o PCoA_2D_plot_BC/


> compute_core_microbiome.py -i otu_table_final_rarefied11422.biom -o otu_table_core


> summarize_taxa_through_plots.py -o Core_taxa_summary11422/ -i otu_table_core/core_table_100.biom 


> make_otu_heatmap.py -i Core_taxa_summary11422/core_table_100_L5.biom -o heatmap_core_family.pdf



