Centromeres undergo dramatic changes in morphology through the cell cycle, alternating between an extended conformation during cell growth and a condensed form incorporating millions of base pairs during mitosis and meiosis. Recently, our laboratory used genetic analysis to define the regions that provide centromere function in Arabidopsis; these regions undergo far less meiotic recombination than the rest of the genome. We are now expanding our analysis of centromeres to determine the relationship between their primary DNA sequence and their secondary structure. First, we are examining the methylation state of the entire centromere, both by digesting genomic DNA with methylation-sensitive enzymes and by using DNA sequencing methods that directly detect methylated cytosine. These experiments demonstrate that methylation levels vary across the centromeres. We are currently investigating whether these differences in methylation levels affect centromere function.
The centromeric regions that we defined contain a surprisingly large number of predicted genes, both in the areas immediately flanking the centromere and within the centromere itself. To date, we have identified 47 genes within the genetically defined centromeres that are expressed at measurable levels. Each of these genes is represented only once within the Arabidopsis genome. At least one of these genes is necessary for Arabidopsis growth, and several others have known metabolic functions. The genes within the centromeric regions are highly methylated, yet are highly expressed. These observations indicate that unique mechanisms for regulating gene expression likely operate in the centromeric regions. Intriguingly, investigation of DNA methylation patterns in the Arabidopsis centromeres showed that methylation is biased toward one strand of the double helix. Extensive stretches of this type of methylation have not been reported previously in any organism. Within the Arabidopsis genome, this pattern of methylation was restricted to the centromeres.