当前位置:首页 >> 农林牧渔 >>

用R语言进行生物多样性分析


IAE

用R语言进行生物多样性分析

王绪高 中国科学院沈阳应用生态所

Institute of Applied Ecology, CAS

IAE

Diversity
Which one is more diverse?

Institute of Applied Ecology, CAS

IAE Common data formats:
(1) Presence/absence data, i.e., 0-1 data (2) Abundance data: number of individuals of each species (3) Cover/biomass data. Cover/biomass are measurements often used in plant ecology. Biomass is occasionally used in insect, marine ecology etc. But (2) and (3) can be converted into presence/absence data

There are many field methods for collecting the above data. Some common ones
include: 1. Quadrat sampling/transect line

2. Trapping (light, pitfall, suction): Mainly used to collect insects. The common form is abundance data. Trapping data cannot be easily linked to sampling area because we do not know the area base where the insects come from. 3. Sighting/hearing (for surveying birds, mammals): Collecting presence/absence data, not accurate for abundance (count). 4. Capture-remark methods: Birds, mammals, fishes.

Institute of Applied Ecology, CAS

IAE

A basic data form
spcode abund 1 ACACME 1 2 ADE1TR 23 3 AEGIPA 4 4 ALCHCO 37 5 ALLOPS 10 6 ALSEBL 231 7 AMAICO 1 8 ANACEX 4 9 ANDIIN 9 10 ANNOSP 4 11 APEIME 47 12 APEITI 4 13 ASPICR 10 14 AST1ST 42 15 AST2GR 13 16 BEILPE 77 17 BROSAL 48 18 CALOLO 14 19 CASEAC 3 20 CASEAR 15 ……

Institute of Applied Ecology, CAS

IAE
richness.

Diversity indices

Species diversity consists of two fundamental components: abundance and

Suppose x = (x1, x2, …, xS) is a sample of abundance of a community, where S is

the number of species.
1. Abundance: the number of individuals of a species in a given area. The total number of abundance is N = x1 + x2 + … +xS = sum(xi)

2. Richness: the number of species in a given area which is S.
These two components are not independent of each other, they are related.

Most diversity indices are quantitative combinations of abundance and richness
in such a way that richness is weighted by relative abundance of each species.

Institute of Applied Ecology, CAS

IAE
3. The Shannon index, H?
S

Diversity indices
H ' ? ? ? pi ln( pi )
i ?1

4. The Simpson index, D

D ? ? pi2

The Shannon and Simpson indices are the two most widely used in the literature. The Shannon weighs towards rare species , while the Simpson weighs towards the abundant species.

Institute of Applied Ecology, CAS

IAE

Diversity indices
5. The Margalef’s index

DMg ?

S ?1 ln N
S N

6. The Menhinick’s index

D Mn ?

7. The McIntosh index, D

D?

N ? N ?

?n
N

2 i

8. The Berger-Parker index, d

d ?

N max N

9. Brillouin index, HB

HB ?

ln N !? ? ln ni ! N Institute of Applied Ecology, CAS

IAE

Relationship among the indices Many indices are not independent but related. Hill demonstrates their relationship.
a a a Da ? p1 ? p2 ? ... ? p s

?

?

1 1? a

where Da is the a-th order of diversity, pi is the proportional abundance of the n-th species. It follows that D0 = number of species D1 = exponential Shannon index D1 ? e H ' D2 = the Simpson index D? = the Berger-Parker index Proof: the Shannon index at a -> 1 (use l’H?pital rule):

? p a log( p ) ? p a log( p ) ? ... ? p a log( p )) 1 2 2 s s ? 1 a a a ? log( p a ? p a ? ... ? p a ) ? ? p1 ? p 2 ? ... ? p s 1 2 s ?? exp Da ? exp ? ? ? ? 1? a ?1 ? ? ? ? ?

? ? ? ? ? ? ?

Hill, M.O. 1973. Diversity and evenness: a unifying notation and its consequences. Ecology 54:427-431.

Institute of Applied Ecology, CAS

Evaluation and choice of diversity indices

IAE
Two criteria of “good” indices: ? High discriminating power: The ability to detect subtle (not unduly) differences between samples. This is an important criterion because one of the major applications of diversity measures is to gauge the effects of environmental changes (pollution or other disturbances) on communities. Independent of sample size: This criterion is most commonly used to judge whether an index is satisfactory or not. A good index must be relatively independent (no indices are truly independent of sample size) of sample size so that one can make sure that the index estimated from relatively small samples will represent the true community.

?

There is little concensus on which indices are “good” (let alone “best”). In general, indices can be divided into two types:
Type 1-- Indices weighted towards species richness (or rarity): Richness, the Margalef, the Menhinick’s, the Shannon index, the Brillouin, logseries a index, and the lognormal l. Type 2 – Indices weighted towards species dominance/evenness (or abundance of species): the Simpson, the McIntosh D, and the Berger-Parker indices.

Institute of Applied Ecology, CAS

IAE

Sample
Quadrat Plot
100 Y coordinat e 0 20 40 60 80

? Single sampling ~ square
sample.ran.squ=function(data, side, plotdim=c(1000,500)) { xlo=runif(1,min=0,max=plotdim[1]-side) ylo=runif(1,min=0,max=plotdim[2]-side) xhi=xlo+side yhi=ylo+side randsample=subset(data, gx>=xlo&gx<xhi&gy>=ylo&gy<yhi) no.ind=length(randsample$sp) no.spp=length(unique(randsample$sp)) return(c(side^2/1e4,no.ind,no.spp))

0

20

40

60

80

100

X coordinate

}

Institute of Applied Ecology, CAS

IAE

Sample

? Single sampling ~ rectangular
sample.ran.rect=function(data, side.x, side.y, plotdim=c(1000,500)) { xlo=runif(1,min=0,max=plotdim[1]-side.x) ylo=runif(1,min=0,max=plotdim[2]-side.y) xhi=xlo+side.x yhi=ylo+side.y randsample=subset(data, gx>=xlo&gx<xhi&gy>=ylo&gy<yhi) no.ind=length(randsample$sp) no.spp=length(unique(randsample$sp)) return(c(side.x*side.y/1e4,no.ind,no.spp)) }

Institute of Applied Ecology, CAS

IAE
Single sampling ~ circle

Sample
sample.ran.circle=function(data, radius, plotdim=c(1000,500),graph=T) { xlo=runif(1,min=radius,max=plotdim[1]-radius) ylo=runif(1,min=radius,max=plotdim[2]-radius) num.ind=length(data$gx) place=numeric() dist=((data$gx-xlo)^2+(data$gy-ylo)^2)^0.5 dist.n=dist<=radius place=(1:num.ind)[dist.n] sample.new=data[place,] sample.new=sample.new[is.na(sample.new$tag)==F,] if(graph){ plot(data$gx,data$gy,xlab="x",ylab="y") points(sample.new$gx,sample.new$gy,type="p", col="blue") } cat("Random sample point is",c(xlo,ylo),sep=" ","\n") return(sample.new) }

Institute of Applied Ecology, CAS

IAE
? Nest sampling

Sample
################# spp.area=function(data,sides) { result=list() nrow=length(sides) for (i in 1:nrow) result[[i]]=spp.area.onetime(data,sides[i]) fullresult=matrix(nrow=0,ncol=5) for(i in 1:length(result)) { fullresult=rbind(fullresult,result[[i]]) } colnames(fullresult)=c("area","ind","spp","shannon","simpson") fullresult=data.frame(fullresult) par(mfrow=c(2,2)) plot(fullresult$area,fullresult$spp,col="red") lines(fullresult$area,fullresult$spp,col="green") plot(fullresult$area,fullresult$ind,col="blue") lines(fullresult$area,fullresult$ind,col="green") plot(fullresult$area,fullresult$shannon,col="black") lines(fullresult$area,fullresult$shannon,col="black") plot(fullresult$area,fullresult$simpson,col="black") lines(fullresult$area,fullresult$simpson,col="black") return(fullresult) }

spp.area.onetime=function(data,side) { randsample=subset(data,data$gx>=0&data$gx<side&data$gy>=0&data$ gy<side) no.ind=length(randsample$sp) no.spp=length(unique(randsample$sp)) abund=numeric() splist=unique(randsample$sp) for (i in 1:no.spp) {abund[i]=length(randsample$sp[randsample$sp==splist[i]]) } p=abund/sum(abund) shannon=-sum(p*log(p)) simpson=1-sum(p^2) return(c(side^2/1e4,no.ind,no.spp,shannon,simpson)) } ############

Institute of Applied Ecology, CAS

IAE

Sample
5×5 10×10 20×20 50×50

?A grid system sample.side=function(data,side,plotdim) { x=y=seq(0,plotdim[1]-side,by=side) n=length(seq(0,plotdim[1]-side,by=side)) x1=rep(x,each=n) y1=rep(y,n) loc=data.frame(x1,y1) no.ind=numeric() no.spp=numeric() for (i in 1:n^2) {randsample=subset(data,data$gx>=loc[i,]$x1&data$gx<(loc[i,]$x1+side)&data$gy>=loc[i,]$y 1&data$gy<(loc[i,]$y1+side)) no.ind[i]=length(randsample$sp) no.spp[i]=length(unique(randsample$sp))} return(data.frame(x=x1,y=y1,ind=no.ind,sp=no.spp)) }
100×100 250×250

Institute of Applied Ecology, CAS

IAE

Variogram

Given a geostatistical model, Z(s), its variogram g(h) is formally defined as
2 1 1 g ( h ) ? var ?Z (s) ? Z ( u ) ? ? ?? ?Z (s) ? Z ( u ) ? f (s, u )dsdu 2 2

where f(s, u) is the joint probability density function of Z(s) and Z(u). For an intrinsic random field, the variogram can be estimated using the method of moments estimator, as follows:

g? ( h ) ?

1 2 N ( h)

N ( h) i ?1

?

?z ( si ) ? z ( si ? h ) ?

2

where h is the distance separating sample locations si and si+h, N(h) is the number of distinct data pairs. In some circumstances, it may be desirable to consider direction in addition to distance. In isotropic case, h should be written as a scalar h, representing magnitude.

Institute of Applied Ecology, CAS

IAE

Variogram

The main goal of a variogram analysis is to construct a variogram that best estimates the autocorrelation structure of the underlying stochastic process. A typical variogram can be described using three parameters: Nugget effect – represents micro-scale variation or measurement error. It is estimated from the empirical variogram g? ( h ) at h = 0.
Range – is the distance at which the variogram reaches the plateau, i.e., the distance (if any) at which data are no longer correlated. Sill – is the variance of the random field V(Z), disregarding the spatial structure. It is the plateau where the variogram reaches at the range, g(range).
1.2

sill = 1.0
0.8

g(h)

0.4

nugget = 0.2
0.0
0 2 4

range h = 5
6 8 10

h

Institute of Applied Ecology, CAS

IAE

Variogram

result=sample.side(bci,side=20,plotdim=c(500,500)) result[1:4,] library(geoR) abund=result[,1:3] abund.geo=as.geodata(abund) dist=seq(0,500,20) abund.var=variog(abund.geo,dir=0,uvec=dist) ##compute semi-variance in other directions plot(abund.var)

spp=result[,c(1,2,4)] spp.geo=as.geodata(spp) spp.var=variog(spp.geo,dir=0,uvec=dist)

Institute of Applied Ecology, CAS

IAE

Variogram

library(geoR) Tri2pa=subset(bci,sp==“TRI2PA”&dbh>0) Tri2=Tri2pa[,3:5] Tri2.geo=as.geodata(Tri2) dist=seq(0,200,10) var=variog(Tri2.geo,dir=0,uvec=dist) #variog(coords=Tri2pa[,3:4],data=Tri2pa[,5],dir=0,uvec=dist) plot(var) variog4(Tri2.geo,uvec=dist)

Institute of Applied Ecology, CAS

IAE

Varpart

Institute of Applied Ecology, CAS

IAE

Varpart

library(vegan) ; varpart

Institute of Applied Ecology, CAS

IAE

Species-area relationship
Sampling species

? ? ? ? ? ? ? ? ? ? ? ?

?

? Number of species

Species-area curve

?

?

?

?

? area

Institute of Applied Ecology, CAS

IAE

Species-area relationship
s ? c ? z ln(a )

The generalized species-area model
Logarithmic model

ds da

?

f (s) a

?

a ? ? s ? gs 2 ? o ( s 3 )
a

Power model
Logistic model

s ? ca z
s? b c ? a?z

A special case of the logistic model (z = 1):

Michaelis-Menten

s?

ba 1 ? ca

The derivative describes the rate of change in the number of species with one unit of change in area. An important point to make: models change with the change of scale.
He, F. and Legendre, P. 1996. On species-area relationships. American Naturalist 148(4): 719-737

Institute of Applied Ecology, CAS

IAE

Species-area relationship

# fit the logarithmic model # logarithm.div.fn=function(data) {##nsp=c+z*ln(area)## area=data$area nsp=data$spp lm.out=lm(nsp~log(area)) lm.res=residuals(lm.out) res.square=sum(lm.res^2) prd.s=predict(lm.out) cat("\n Logarithm model:residuals^2 = ", res.square, "\n") return(list(prd.s=prd.s,sum=summary(lm.out))) } #######data here is the result of nest sampling####

Institute of Applied Ecology, CAS

IAE

Species-area relationship

# fit the power model # power.div.fn=function(data) {##nsp=b*area^(z) area=data$area nsp=data$spp nls.out=nls(nsp~b*area^z,start=c(b=10,z=0.5)) nls.res=residuals(nls.out) res.square=sum(nls.res^2) prd.s=predict(nls.out) cat("\n Power model: residuals^2 = ", res.square, "\n") return(list(prd.s=prd.s,sum=summary(nls.out))) } #######data here is the result of nest sampling####

Institute of Applied Ecology, CAS

IAE

Species-area relationship

# fit the logistic model # logist.div.fn=function(data) {##nsp=b/(a+area(-z))## area=data$area nsp=data$spp nls.out=nls(nsp~b/(a+area^(-z)),start=c(b=10,a=0.1,z=0.5)) nls.res=residuals(nls.out) res.square=sum(nls.res^2) prd.s=predict(nls.out) cat("\n Logistic model: residuals^2 = ", res.square, "\n") return(list(prd.s=prd.s,sum=summary(nls.out))) } #######data here is the result of nest sampling####

Institute of Applied Ecology, CAS

IAE

Species-area relationship

div.area.r=function(data) { area=data$area nsp=data$spp prd.log.s=logarithm.div.fn(data) prd.power.s=power.div.fn(data) prd.logist.s=logist.div.fn(data) par(mfrow=c(1,2)) plot(area,nsp, xlab="area of sample", ylab="number of species",type="b") lines(area,prd.log.s$prd.s,col="blue") lines(area,prd.power.s$prd.s,col="green") lines(area,prd.logist.s$prd.s,col="red") plot(log(area),nsp, xlab="log(area of sample)", ylab="number of species",type="b") lines(log(area),prd.log.s$prd.s,col="blue") lines(log(area),prd.power.s$prd.s,col="green") lines(log(area),prd.logist.s$prd.s,col="red") } #######data here is the result of nest sampling####

Institute of Applied Ecology, CAS

IAE

Species-abundance relationship/distribution (SAD)

? SAD is simply the abundance of each species in a community. In no community do species have equal abundance. Instead, communities are almost always found to have many rare species and a few common ones. Ecologists believe that SAD is an ecological and evolutionary product which reflects the interactions of species, the effect of habitat adaptation and population fitness. Therefore, each community should have its own characteristic SAD. ? SAD is important because
? It completely describes the structure (species composition) of a community. ? It provides evidence for understanding community assembly rules: how species come into coexistence? How species share resources? Why there are always many more rare species than common ones? Why some species have to be rare while others to be so abundant? ? Species-abundance data provide an only solid basis to examine biodiversity (because SAD contains all the information about abundance of each species).

Institute of Applied Ecology, CAS

IAE

[1], [2-3], [4-7], [8-16], …

[1]/2,[1]/2+ [2]/, [2]/+[3]+[4]/2, [4]/2+[5-7]+[8]/2, …

abund1=sort(abund.data,decreasing=T)

cdf.obs[i]=length(abund[abund<=x[i]])

Institute of Applied Ecology, CAS

IAE
? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?

SAD

SAD. Preston1.bin # plot RSA using Preston's 2^2 scale: (0, 1], (1, 3], (3, 7], (7,15],... ############################# #Volkov's method for splitting boundary binnings to right and left bins count1=function(abund.dat) ##[1]/2, [1]/2+[2]/2,[2]/2+[3]+[4]/4,[4]/2+[5-7]+[8]/2,….. { ############# ##data is bci$sp sp.abund=function(data) splist=unique(abund.dat);abund=numeric() { nsp=length(splist) ##data is bci$sp for (i in 1:nsp) sp=count1(data)$splist { abund=count1(data)$abund abund[i]=length(abund.dat[abund.dat==splist[i]]) spnum=length(sp) } Preston1.nsp=numeric() return(data.frame(splist=splist,abund=abund)) Preston2.nsp=numeric() } number=ceiling(log2(max(abund))) ######################################## for (i in 1:number){ for (j in 2:i+1) { Preston1.nsp[i]=sum(length(sp[abund>=2^(i-1)&abund<=(2^i-1)])) Preston2.nsp[1]=0.5*length(sp[abund==1]) Preston2.nsp[j]=sum(length(sp[abund>=2^(j-2)&abund<=(2^(j-1))]))-0.5*length(sp[abund==2^(j-2)])-0.5*length(sp[abund==2^(j-1)]) } } par(mfrow=c(2,2)) hist(abund,xlab="Abundance",ylab="Species frequency",main="RSA") plot(c(1:number), Preston1.nsp, xlab="Abundance class", ylab="Species frequency", type="h", main="RSA: Preston Binning 1", lwd=4) lines(c(1:number),Preston1.nsp,col="red") plot(c(1:(number+1)), Preston2.nsp, xlab="Abundance class", ylab="Species frequency", type="h", main="RSA: Preston Binning 2", lwd=4) lines(c(1:(number+1)),Preston2.nsp,col="green") plot(1:length(sp), log(sort(abund, decreasing=T)), type="o", xlab="Species sequence", ylab="log(abundance)", main="Dominance Curve") cat("\n observed number of species = ", spnum, "\n") return(list(preston1.bin=Preston1.nsp,preston2.bin=Preston2.nsp)) }

Institute of Applied Ecology, CAS

IAE

SAD~logseries distribution

? This model is first used by Fisher, Corbet & Williams (1943) to model speciesabundance pattern of Malayan butterflies and Lepidoptera (butterflies, moths) caught by a light-trap at Rothamsted Experimental Station. It represents the first attempt to describe the mathematical relationship between the number of species and the number of individuals in those species. It is one of the two best known SAD models. ? The frequency (the number of species) with n individuals in a community is

fn ?

ax n
n

n = 1, 2, 3, …

? where a and x are two parameters, a > 0 and 0 < x < 1 (in most cases x > 0.9), but they are not independent (we will show that in a moment). ? The terms of the logseries distribution are the number of species with one individual, two individuals, three individuals…:

ax

ax 2
2

ax3
3

….

Institute of Applied Ecology, CAS

Estimating methods

IAE MLE 1.
Here pn is the probability, no longer frequency.

pn ?

ax n
n

n = 1, 2, 3, …

L ( x ) ? ? f i ? f1 f 2 ... f S ?
i ?1

S

a x n1 a x n 2
n1 n2

...

ax nS
nS

?

a S x n1 ? n 2 ? ...? n S
n1n2 ... n S

l ( x ) ? S ln(a ) ? N ln( x ) ? ? ln( ni )
i ?1

S

? a ln(1 ? x ) ? 1
x ? 1 ? e ?1 / a

l ( x ) ? S ln(a ) ? N ln(1 ? e ?1 / a ) ? ? ln( ni )
i ?1

S

a (e1 / a ? 1) ?
x ? 1 ? e ?1 / a

N S

Institute of Applied Ecology, CAS

IAE on these terms, we can calculate the total number of species and total number of individuals. Based
1. Total number of species, S, is obtained by adding all the terms in the series:

S ? ax ?

ax
2

2

?

ax
3

3

? ... ?

ax
n

n

? ...

n n ?1

?

?

ax n

? ?a ln(1 ? x )

1 ? x ? x 2 ? x 3 ? ... ? x n ? ... ?

1 1? x

This summation reduces to the following equation (how?):
That is why it is called logarithmic series, while in fact is a power series dist. S = 1 if the logseries model is described as probability rather than frequency.

S ? ?a ln(1 ? x )

2. The total number of individuals:

N ? a ( x ? x 2 ? x 3 ? ... ? x n ? ...) ?

ax
1? x

N ? 1a x ? 2

ax 2
2 2

?3

ax3
3 3

? ... ? n

ax n
n n

? ...

N?

ax
1? x

N ? a x ? a x ? a x ? ... ? a x ? ...

Institute of Applied Ecology, CAS

IAE Estimating methods
1. Moment method (what that means?)

fn ?

ax n
n

n = 1, 2, 3, …

S ? ?a ln(1 ? x )

N?

ax
1? x

For bci abundance data : S = 321, N = 368122.

S ? a ln(1 ? 368122

N

a

)

321 ? a ln(1 ? x? N N ?a

a

)

Results: a = 34.62111, x = 0.99991

Institute of Applied Ecology, CAS

IAE
SAD~logseries

Parameter estimation

logseries.moment=function(alpha, abund) { ###sp.abund=count1(bci$sp); abund=sp.abund$abund # alpha is initial value, which is needed for the Newton method # estimate the parameters for the logseries distribution S=length(abund) # total number of species N=sum(abund) # total number of individuals optim.out=optim(alpha,optim.moment.fn, S=S,N=N) print(optim.out) alpha=optim.out$par x=N/(N+alpha) cat("\n alpha = ", round(alpha,5), "x = ", round(x,5), "\n") } ######################### # optimization function # ######################### optim.moment.fn=function(alpha,S,N) { # The function is: S=alpha*log(1+N/alpha) abs(S-alpha*log(1+N/alpha))
}

Institute of Applied Ecology, CAS

IAE
SAD~logseries

Parameter estimation

Maximum likelihood estimates logseries.mle.r=function(alpha, abund) { S=length(abund) N=sum(abund) optim.mle.out=optim(alpha,optim.mle.fn,,abund=abund) alpha=optim.mle.out$par x=1-exp(-1/alpha)

# total number of species # total number of individuals

cat("\n alpha = ", round(alpha,5), "x = ", round(x,5), "\n") } ######################### # optimization function # ######################### optim.mle.fn=function(alpha,abund) { S=length(abund) N=sum(abund) obj.fn=S*log(alpha)+N*(log(1-exp(-1/alpha)))-sum(log(abund)) -obj.fn }

Institute of Applied Ecology, CAS

IAE

SAD~ Lognormal distribution

Lognormal distribution
It is one of the two most widely used species-abundance model, first used by Preston in 1948. Different from the logseries model, it is a model for continuous random variable, i.e., abundance is considered as a continuous random variable (the model is therefore applicable to biomass and cover data). This assumption is approximately valid for species rich communities. It has been found that many real communities follow the lognormal distribution. This is particularly true for species rich communities such as tropical rainforests.
If y = ln(x) is normally distributed y ~ N(s2, m), then x = exp(y) is lognormal. It has the form:

f ( x) ?

1 2? sx

e

1 ? ln( x ) ? m ? ? ? ? 2? s ?

2

Institute of Applied Ecology, CAS

Estimation methods (MLE and moment estimate): IAE
1 ? ln( x ) ? m ? ? ? ? 2? s ? e
2

f ( x) ?

1 2? sx

where m and s2 are the mean and variance of the normally distributed y which are the mean and variance of the log-transformed abundance data.

The mean and variance of the lognormal distribution are:

mx ? e
2 sx

m ?s 2
s2

? (e

? 1) e

2 m ?s 2

Institute of Applied Ecology, CAS

IAE
BCI data:
f ( x) ? 1 2? sx

1 ? ln( x ) ? m ? ? ? ? 2? s ? e

2

Steps to estimate m and s2:

1. Log-transform the abundance data, 2. Calculate the mean and variance of the log-transformed data: m = 4.8194, s2 = 5.7998, 3. Substitute the mean and variance into the lognormal model:
1 ? ln( x ) ? 4 .82 ? ? ? ? 2? 2.408 ?
2

f ( x) ?

1 e 6.04 x

Institute of Applied Ecology, CAS

IAE

Parameter estimation

SAD~Lognormal distribution
mle.lognormal.r=function(data, mu, sigma) {###sp.abund=count1(bci$sp); data=sp.abund$abund # MLE estimation of lognormal distribution # There are 2 parameters: mean (mu) and std (sigma) startparam <- c(mu, sigma) out <- optim(c(startparam[1], startparam[2]), fn = lognormal.llik.obj, x=data) mu <- out$par[1] sigma <- out$par[2] AIC=out$value*2+2*length(startparam) cat(" AIC= ",round(AIC,5), "\n") cat(" mu = ", round(mu, 5), "\n") cat(" sigma = ", round(sigma, 5), "\n") } #################################################### # The objective of the normal likelihood function # #################################################### lognormal.llik.obj=function(param, x) { mu = param[1] sigma = param[2] s=length(x) obj.fn=-0.5*s*log(2*pi)-s*log(sigma)-sum(log(x))-0.5*(sigma^(-2))*sum((log(x)-mu)^2) -obj.fn }

Institute of Applied Ecology, CAS

IAE

Model validation

AIC (Akaike Information Criterion) ? AIC=-2L + 2k – L is the log-likelihood and k is the number of parameters in the models. ? AICc=AIC+2k(k+1)/(n-k-1) – For small sample sizes(n): n/k<40 ? ?AIC<2: equivalent; ? ?AIC 4-7: distinguishable; ? ?AIC>10: definitely different

Institute of Applied Ecology, CAS

IAE

Thank you!
Institute of Applied Ecology, CAS


相关文章:
R语言在生物统计学教学中的应用
生物统计课程是生命科学领域本科学生的必修课程,因其 理论性强、内容抽象、计算...中用到的函数进行简单归纳,并通过实例演示分析 r 语言生物统 计教学中的优势...
R语言在基因芯片数据处理中的应用
R语言在基因芯片数据处理中的应用_生物学_自然科学_...用 R 和 BioConductor 进行基因芯片数据分析(二):缺失...通过这些基因 的表达值(依照表达谱相似性加权)来...
R语言绘图:相关性分析及绘图展示
R语言绘图:相关性分析及绘图展示_生物学_自然科学_专业资料。利用R语言进行相关性分析及结果展示 相关性分析 gaom 在我们平时分析的时候,经常会遇到样品间的相关性...
利用R语言绘制韦恩图
利用R语言绘制韦恩图_生物学_自然科学_专业资料。利用R语言绘制韦恩图利用R 语言绘制韦恩图 (2012-12-16 21:10:22) 有关韦恩图的说明请参见维基百科,此处就不...
R语言绘图:聚类分析及绘图
R语言绘图:聚类分析及绘图_生物学_自然科学_专业资料。利用R语言进行聚类分析及结果展示 聚类树构建及绘图 gaom 在选择样品进行分析时,我们常常会根据样品的一些...
R语言常用函数
R语言常用函数_生物学_自然科学_专业资料。R 语言常用函数基本 一、数据管理 vector:向量 numeric:数值型向量 logical:逻辑型向量 character;字符型向量 list:列表 ...
R语言在时间序列中的应用
R语言在时间序列中的应用_生物学_自然科学_专业资料。时间序列分析在人口预测问题...二.时间序列的预处理通常得到一个观察值序列后首先要对其进行平稳性以及纯随机性...
R语言与机器学习(5)神经网络
R语言与机器学习(5)神经网络_计算机软件及应用_IT/...(ANN),简称神经网络,是一种模仿生物神经网络的结构...我们使用 nnet 函数分析 Vehicle 数据。随机选择半...
R语言绘图:PCA分析和散点图
R语言绘图:PCA分析和散点图_生物学_自然科学_专业资料。利用R语言进行PCA分析及结果展示 PCA 分析和散点图 gaom 今天主要跟大家演示一下简单的 PCA 分析,并且...
[赞]R语言教程笔记-入门级2--知其然
R 程序包是 R 功能扩展,特定的分析功能,需要用相 应的程序包实现。例如:系统...生物多样性计算 CRAN Task Views 中有对程序包的分类介绍 4.3 R 程序包安装 ...
更多相关标签: