Identification of novel mitosis regulators through data mining with human centromere/kinetochore proteins as group queries

Background Proteins functioning in the same biological pathway tend to be transcriptionally co-regulated or form protein-protein interactions (PPI). Multiple spatially and temporally regulated events are coordinated during mitosis to achieve faithful chromosome segregation. The molecular players participating in mitosis regulation are still being unravelled experimentally or using in silico methods. Results An extensive literature review has led to a compilation of 196 human centromere/kinetochore proteins, all with experimental evidence supporting the subcellular localization. Sixty-four were designated as “core” centromere/kinetochore components based on peak expression and/or well-characterized functions during mitosis. By interrogating and integrating online resources, we have mined for genes/proteins that display transcriptional co-expression or PPI with the core centromere/kinetochore components. Top-ranked hubs in either co-expression or PPI network are not only enriched with known mitosis regulators, but also contain candidates whose mitotic functions are not yet established. Experimental validation found that KIAA1377 is a novel centrosomal protein that also associates with microtubules and midbody; while TRIP13 is a novel kinetochore protein and directly interacts with mitotic checkpoint silencing protein p31comet. Conclusions Transcriptional co-expression and PPI network analyses with known human centromere/kinetochore proteins as a query group help identify novel potential mitosis regulators.


Background
Mitosis is a complicated cellular process involving extensive structural reorganizations in many subcellular compartments and a sequence of highly orchestrated events. The temporal and spatial changes in mitotic cells are tightly regulated to ensure high fidelity of genomic transmission during cell division. Mitosis is initiated by accumulation of active kinase complexes formed between mitotic cyclins (cyclin A and B in human) and the master mitosis regulator CDC2 (or CDK1) at the G2/M transition [1]. The expression of many other mitosis regulators also peaks during G2/M phase, and some of them share common control by transcription factors such as FoxM1 and the DREAM complex [2,3]. Chromosome condensation appears during prophase, concurring with reorganization of the microtubule cytoskeleton into mitotic spindles and separation of duplicated centrosomes to opposite sides of the nucleus. The activity of cyclin A/CDC2 lasts until nuclear envelope breakdown, when cyclin A is degraded [4]. The cyclin B/CDC2 complex, together with many other chromosome- and microtubule-associated proteins, promotes the formation of the bipolar spindle and chromosome congression to the metaphase plate [1]. Cyclin B is then destroyed after ubiquitylation by the anaphase promoting complex/cyclosome (APC/C) [1,5]. Loss of cyclin B/CDC2 activity ensures unidirectional progression of mitosis [6]. Sister chromatids separate and move to opposite spindle poles after anaphase onset. Chromosomes then decondense, and the nuclear envelope reforms during telophase. When cytokinesis is completed, abscission occurs at the midbody between the two daughter cells, the spindle is disassembled and cells flatten out into interphase morphology again.
The centromeres are specialized loci on chromosomes that form primary constrictions during mitosis. There are currently 18 known human proteins that constitutively associate with centromeres throughout the cell cycle [7]. The kinetochores are macromolecular protein complexes built upon centromeres to connect with spindle microtubules [8]. Kinetochores are dynamic structures that are assembled and disassembled at each sister chromatid during every mitosis [8]. Kinetochores also harbour activities contributing to chromosome movement throughout mitosis and to the spindle assembly checkpoint (SAC) [5]. The SAC monitors kinetochore-microtubule attachment status to inhibit activation of the APC/C until the metaphase-to-anaphase transition. Under the electron microscope, centromeres and kinetochores are structurally contiguous, and both play important roles in regulating chromosome segregation. We will use the term "the centromere/kinetochore complex" in this paper to reflect the intertwined relationship between the two subcellular structures.
Although the knowledge of proteins localized at the centromere/kinetochore complex has increased exponentially in the past few years [7][8][9][10][11], we still have much to learn about the proteins that contribute to the spatial and temporal regulation of mitosis. As indicated in recent reports, without a "parts list" of critical molecular players, it is impossible to reach a comprehensive understanding about mitosis and its connections with tumorigenesis and cancer drug effects [9,10,12]. Recent genomics and proteomics research has revealed that proteins in the same functional modules are usually transcriptionally co-expressed and/or organized into clusters in PPI networks (e.g. [13][14][15]). Many publicly available bioinformatics resources have deposited data obtained from large scale co-expression profiling and PPI studies, and provided tools to retrieve and organize the data.
As end users interested in taking advantage of the huge amount of genomics and proteomics data in the public databases, we first conducted an exhaustive literature review and compiled a list of 196 human centromere/kinetochore proteins, and selected 64 among them as "core" components with well-characterized mitotic functions. By interrogating and integrating available online resources using the 64 core proteins as a query group, we then identified potential novel mitosis regulators among top-ranking genes/proteins that co-express or interact with the 64 core centromere/kinetochore components. Experimental validation has identified one novel kinetochore protein, TRIP13, and two novel centrosomal proteins, KIAA1377 and DDX39.

Results and discussion
Compiling a comprehensive list of human centromere/kinetochore proteins
Several reviews have previously summarized known centromere/kinetochore proteins in human cells [7][8][9][10][11][16,17]. Due to rapid progress in the field and extensive use of aliases in the literature, omissions in the published lists have become obvious; we therefore carried out an exhaustive literature search aiming to compile a comprehensive list of human centromere/kinetochore proteins. A protein is defined as centromere/kinetochore-localized only when experimental evidence such as immunofluorescence or fluorescent protein fusions supported the claim (except for several condensin and cohesin subunits, see below). As of April 30, 2012, a total of 196 human proteins corresponding to specific genes have been localized at the centromere/kinetochore complex in the published literature (Additional file 1: Table S1 and references therein). In addition, a phospho-specific epitope, recognized by monoclonal antibody 3F3/2 [18], resides in kinetochore proteins. The epitope, generated at least partially by Plk1 kinase, is likely to be found on multiple centromere/kinetochore proteins [19,20]. Some proteins carrying 3F3/2 epitopes have been determined (e.g. BUBR1) [21], but many remain to be characterized. Among all the centromere/kinetochore proteins, only two cohesin subunits, encoded by Rec8 and STAG3, are meiosis specific. Not all condensin and cohesin subunits have been experimentally localized at the centromere/kinetochore complex. However, both condensin and cohesin complexes are essential non-histone structural components along chromosomes and play important roles in chromosome dynamics throughout the cell cycle; we therefore tentatively include all condensin and cohesin subunits as centromere/kinetochore proteins [22,23]. To facilitate future research on the centromere/kinetochore proteins, we included gene symbols, Entrez gene IDs and common aliases for each gene in Table S1.
When compared to previous summaries, the compilation has significantly expanded the list of known human centromere/kinetochore proteins, from ~120 to 196. The list still does not include all the subunits of several well-characterized protein complexes such as the dynein-dynactin complex and the γ-tubulin ring complex, both shown to associate with the centromere/kinetochore [24]. Most likely the missing subunits are also targeted to the centromere/kinetochore as part of the protein complexes, but as yet the localization has not been experimentally demonstrated. We will also report TRIP13 as a novel kinetochore protein below. A recent mass spectrometry based study estimated a total of ~200 kinetochore proteins [12]. Our survey indicates that the human centromere/kinetochore is indeed a complicated structure with constitutive and transient components easily exceeding 200 proteins.
We compared our list of human centromere/kinetochore proteins to annotations in Gene Ontology (GO) (http://www.geneontology.org/) [25]. The GO term search returned 49 categories containing "centromere" and 31 containing "kinetochore" in the GO titles or definitions. Not all these GO categories contain human proteins. A total of 247 human proteins have been annotated in GO as involved in centromere- or kinetochore-related localization, functions or processes, of which 128 appeared in our list of centromere/kinetochore proteins (Additional file 2: Table S2). Among the remaining 119 GO-annotated centromere/kinetochore proteins, some may participate in regulation of centromere/kinetochore functions but do not localize at the structure themselves (e.g. proteins encoded by SUGT1 and SENP6) [26,27]. Some others such as CENPBD1 were annotated by inference without experimental evidence. A few more genes (e.g. BAZ1B) have been localized in other species but not in human cells. Proteins listed in the latter two categories are worthy of further exploration in order to completely catalogue human centromere/kinetochore components (Additional file 2: Table S2). However, SS18L1 may be mistakenly annotated because one of its aliases, CREST, is identical to the name of the autoimmune antibody commonly used to stain centromeres.
Genes co-expressing with 64 core centromere/kinetochore components
Transcriptional co-expression profiling has been extensively used to uncover functional gene modules involved in common biological processes [15][28][29][30][31][32][33]. These modules can be detected by comparing the similarity of gene expression patterns using the Pearson correlation coefficient or other metrics. Value-based and rank-based methods have been developed to construct the co-expression network. Advantages of the rank-based method have been discussed in recent publications [34,35]. In our initial efforts, we adopted a simple rank-based method to retrieve genes that co-express with a selection of 64 "core" centromere/kinetochore components (Table 1, also marked by asterisks in Additional file 1: Table S1). The 64 genes were chosen based on higher expression and/or better characterized functions in mitosis. The selection, when used collectively as a query group, is expected to reduce the noise common in association studies and enhance the specificity in retrieving mitosis-relevant genes. Upon querying the Human Gene Sorter with the 64 genes, we extracted and ranked a total of 3828 genes that appeared at least once in a combined list of "top 50" co-expressing genes (Additional file 3: Table S3, see Methods for details). Forty-six of the 64 queries were represented in the combined list, with a total of 422 occurrences (Table 1). Moreover, 111 of 196 known centromere/kinetochore proteins appeared 680 times in total. The representation of both groups was significantly higher than expected by chance (Z scores of 13.52 and 7.66 respectively, P < 0.001).
The genes that appeared more than 9 times in the combined co-expression list are summarized in Table 2. ZWINT, a well-characterized kinetochore protein, was represented 34 times in the combined list and ranked at the top [36]. Surprisingly, the second-ranking gene TRIP13 has never previously been associated with mitosis regulation but will be further characterized in this work as a novel kinetochore component (see below). GO term enrichment analyses of the genes in Table 2 indicated significantly higher representation of genes functioning in mitosis (Additional file 4: Table S4). A few categories of genes participating in DNA replication were also enriched in the results. Despite the possibility that the co-expression search may have recovered genes generally important for cell proliferation, it should be noted that at least some of the genes (e.g. CHTF18, RFC3, RFC4 and RFC5, which encode RFC complex subunits) are critical for cohesion establishment during S phase [22,37]. As a further validation of the search strategy, we noticed that certain genes in Table 2 have only been experimentally confirmed to participate in mitosis regulation in recent years, including those encoding centrosomal proteins STIL [38] and HMMR (RHAMM, [39]); centromere proteins RACGAP1 [40] and SUPT16H (Spt16 subunit of FACT complex, [41]); and other recently identified mitotic proteins NUSAP (involved in spindle organization, [42]) and CKAP2 (spindle function, [43]). This group can also be extended to include ITGB3BP (encoding CENP-R), MLF1IP (encoding CENP-U), OIP5 (Mis18β), and HJURP, which all encode newly identified centromere proteins [44,45]. Although the latter group of genes were among the 64 queries, they were so highly represented that they would still be retrieved if most of the constitutive centromere proteins (CENP-I to CENP-X) were omitted from the queries.
In addition, Table 2 and Additional file 3: Table S3 also contain several subunits of origin recognition complexes (ORC1L, ORC2L, ORC4L, ORC5L, ORC6L), γ-tubulin ring complexes (TUBGCP2, TUBGCP4, TUBGCP5), and condensin and cohesin complexes, supporting the roles of these complexes at kinetochores or in mitosis [24,46,47]. The genes co-expressing with 64 centromere/kinetochore components were further analysed using the CoExSearch program which accepts a group of queries to search and rank common co-expressing genes [48,49] (see Methods for details). As seen in Additional file 5: Table S5, 37 out of the 64 genes are found among the top 300 genes (no gene chip data for CENP-P in CoExSearch). In addition, the CoExSearch and Gene Sorter "top 300" lists share 123 genes, with TRIP13 among them (Additional file 6: Table S6). Due to coordination of many events in orchestrating mitosis progression, it should not be surprising that many among the 123 genes encode proteins that participate in different aspects of mitosis regulation, even though the query is a group of centromere/kinetochore proteins. Again, some among the 123 genes have only recently been associated with mitotic functions or structures such as FANCD2, FANCI and HYLS1 [50,51]. Future efforts will be directed to analyze those that still do not have defined mitotic functions (marked by "?" in Additional file 6: Table S6).
Proteins interacting with 64 core centromere/kinetochore components
We then used the POINeT website to obtain the "sub-network specific" PPI data with the 64 core centromere/kinetochore proteins collectively as a query group (see Methods for details) [52,53]. The tool was chosen mainly because it distinguishes a protein's total interactors from the interactors within a "sub-network" (in this case determined by the group of 64 query genes). The tool partially solved the problem of retrieving too many "false positive" interactors that may share no functions with the queries. Fifty-eight out of 64 queries returned interactors, with 452 nonredundant PPIs involving 352 interactors (including queries) retrieved. Top-ranked non-query proteins that tend to interact with the core centromere/kinetochore proteins are presented in Table 3 and a full list in Additional file 7: Table S7. The functional relevance of the ranking was supported by several lines of evidence. First, the top 21 genes and a total of 44 genes in the list encode query proteins, indicating the clustering of search results. Second, some non-query proteins such as different isoforms of the regulatory B subunit of phosphatase 2A (two encoded by PPP2R5A and PPP2R5D in the list) were recently shown to localize at kinetochores and affect kinetochore-microtubule interactions [54,55]. In addition, many among the top-ranked non-query genes encode known centromere/kinetochore or mitotic regulatory proteins such as PARP2, CBX5 (encoding HP1α), CCNB1 (encoding cyclin B1); microtubule subunits and associated proteins TUBA4A and MAPRE2 (encoding EB2); proteins involved in mitotic ubiquitylation and regulation: CDC16, ANAPC7, CDC27 (three APC/C subunits), FBXO5 (encoding Emi1, an APC/C inhibitor), MAD2L2 (another APC/C inhibitor) and PSMA3 (a proteasome subunit).
The mitotic functions of several other proteins are less well characterized, but they were all reported in the literature to interact with at least one of the 64 centromere/kinetochore proteins. Of interest, adapter proteins AP4B1, AP3B1 and AP2B1 interact with mitotic checkpoint kinases BUBR1 and BUB1, but the mitotic functions of the association have not been addressed [56]. However, clathrin has recently been shown to affect spindle integrity during mitosis (for example, [57,58]), raising the possibility that these vesicle-trafficking proteins may indeed have mitotic functions. In addition, RECQL5 and BRCA2, two proteins involved in DNA repair, are of interest. BRCA2 has been localized at centrosomes and may affect genomic stability by altering centrosome behaviour [59]. Furthermore, KIAA1377 not only is localized at the midbody [52], but also interacts with kinetochore proteins ATRX, BMI1, CCDC99 (Spindly), MAD2L1BP (p31comet), PMF1 and other known mitosis regulators (http://www.ncbi.nlm.nih.gov/gene?term=kiaa1377). Interestingly, TRIP13, one of the top-ranked co-expressing genes, was also found to interact with p31comet and several other mitosis regulators, although it did not make it into the PPI top-ranking list, mainly due to its large number of total interactors.

Comparison of co-expression and PPI search results
Nineteen genes/proteins are ranked high in both Gene Sorter co-expression and POINeT PPI lists (Table 2 and Additional file 7: Table S7), of which 16 were in the query list (BUB1, BUB1B, CDCA8, CENP-A, CENP-E, CENP-F, ITGB3BP, KIF2C, KNTC1, MAD2L1, NDC80, NEK2, PLK1, SPC25, TTK and ZWINT). The 3 non-query genes CCNA2, CCNB1 and CDC2 encode cyclin A2, cyclin B1 and CDC2 kinase. Similarly, comparison of the co-expressing genes in Table S6 and the PPI list in Table S7 found 21 common genes, with 18 in the query list (BUB1, BUB1B, CDCA8, CENP-A, CENP-E, CENP-F, KIF2C, KNTC1, MAD2L1, NDC80, NEK2, NUF2, PLK1, SGOL2, SPC25, TTK, ZWILCH and ZWINT) and 3 non-query genes: CCNA2, CCNB1 and BRCA2. The convergence of results on well-known mitosis regulators is encouraging, reflecting that the searches have generated functionally relevant results. How much overlap one should expect from parallel co-expression and PPI analyses is hard to predict, as the two analyses are based on information at the mRNA and protein levels, respectively, and the PPI coverage of the proteome is usually much lower than the coverage of the genome by transcriptional profiling. It is acknowledged that high-throughput data contain intrinsic noise that affects both co-expression and PPI analyses [60][61][62]. In addition, many genes/proteins may participate in multiple biological processes. Using a group of functionally associated or subcellularly co-localized genes/proteins as queries has the potential to reduce database noise and selectively screen for candidates that are functional in a specific biological process or structure. Although we are mostly interested in using such a "group query" strategy in combination with online resources to identify candidate mitosis regulators for experimental validation, refining the search strategy in collaboration with computer scientists and statisticians may prove useful in applying "partial knowledge" to obtain a more comprehensive understanding of a biological process or structure.
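The comparison described above amounts to intersecting two ranked hit lists and splitting the shared genes into query and non-query members. A minimal Python sketch follows; the function name `compare_hit_lists` and the toy inputs are hypothetical and serve only as an illustration. They are not the contents of Table 2, Table S6 or Table S7.

```python
def compare_hit_lists(coexpression_hits, ppi_hits, queries):
    """Split genes shared by two hit lists into query and non-query members."""
    shared = set(coexpression_hits) & set(ppi_hits)
    query_set = set(queries)
    shared_queries = sorted(shared & query_set)
    shared_non_queries = sorted(shared - query_set)
    return shared_queries, shared_non_queries


# Toy inputs for illustration only (not the actual tables).
toy_coexpression = ["ZWINT", "TRIP13", "CCNB1", "BUB1", "CCNA2"]
toy_ppi = ["BUB1", "CCNB1", "KIAA1377", "ZWINT", "BRCA2"]
toy_queries = ["ZWINT", "BUB1", "PLK1"]

in_query, not_in_query = compare_hit_lists(toy_coexpression, toy_ppi, toy_queries)
```

In this toy case the shared query genes are BUB1 and ZWINT, and the only shared non-query gene is CCNB1.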

TRIP13 is a novel kinetochore protein
To further validate the results from bioinformatics studies, subcellular localization was determined for seven candidate genes as a preliminary evaluation of their potential mitotic functions. The cDNAs were cloned as GFP fusions and were all confirmed to express at the expected sizes (Additional file 8: Figure S1).
As mentioned above, TRIP13 is a top-ranking gene that co-expresses with centromere/kinetochore components, but has not been associated with any mitotic functions. The yeast, worm and mouse homologs of TRIP13 have all been implicated in meiotic recombination [63][64][65]. However, TRIP13 is widely expressed in somatic tissues (information from Gene Sorter). Moreover, proteomic studies have found that TRIP13 interacts with p31comet, an important spindle assembly checkpoint silencing protein [66]. The interaction is conserved in both human and mouse cells [67,68]. TRIP13 encodes an AAA-ATPase, is overexpressed in many cancers and is hence listed in multiple cancer signatures [31][69][70][71].
We first confirmed by GST-pulldown that GFP-TRIP13 associates with GST-p31comet in cell lysates (Figure 1A). Furthermore, purified recombinant GST-p31comet and His-TRIP13 directly interact in vitro at physiologically relevant concentrations (our estimates of the endogenous concentrations of p31comet and TRIP13 are both around 100 nM) (Figure 1B&C). In interphase cells GFP-TRIP13 is distributed in endoplasmic reticulum-like structures and partially localizes at the nuclear envelope (data not shown). GFP-TRIP13 was observed to concentrate at kinetochores shortly after nuclear envelope breakdown, as evidenced by co-localization with BUBR1 and the centromere marker ACA (Figure 2A). In later prometaphase cells, GFP-TRIP13 was only observed at kinetochores that also stained positive for the SAC protein MAD2 (Figure 2B&C). Similarly to MAD2, GFP-TRIP13 disappears from kinetochores in metaphase and anaphase cells (data not shown). The kinetochore localization of GFP-TRIP13 seems independent of microtubules, as it remains co-localized with HEC1/hNdc80 in chromosome spread preparations made from cells treated with nocodazole and hypotonic buffer (Figure 2D&E). We concluded that TRIP13 is a novel kinetochore protein that interacts with p31comet. As SAC silencing requires energy input that utilizes hydrolysis of the β-γ phosphoanhydride bond in ATP [72,73], we are testing whether TRIP13 as an AAA-ATPase facilitates p31comet-mediated checkpoint silencing.

KIAA1377 is a novel centrosomal protein
KIAA1377 was retrieved as a top-ranking non-query protein that associates with centromere/kinetochore proteins in the PPI analyses. KIAA1377 was previously shown to localize at the midbody [52]. GFP-KIAA1377 was confirmed to localize at the midbody during cytokinesis, co-localizing with microtubule bundles adjacent to two Plk1-containing discs (Figure 3A&B). In cells showing relatively higher expression, GFP-KIAA1377 was also observed to extensively overlap with the microtubule network in interphase cells and the spindle in mitotic cells (Figure 3C&D). Most interestingly, GFP-KIAA1377 co-localizes with centrin 2 at centrosomes throughout the cell cycle except on newly assembled daughter centrioles, where GFP signals are absent (Figure 3D, S/G2). In early G1 cells, the signals of GFP-KIAA1377 at centrosomes are dim when compared to those at the midbody; nonetheless they are reproducibly detectable (Figure 3D, Early G1; Additional file 9: Figure S2, Early G1). Co-staining with γ-tubulin antibody confirmed the centrosome localization pattern of GFP-KIAA1377 (Additional file 9: Figure S2). The results indicate that KIAA1377 is likely a novel centrosomal protein. As mentioned above, KIAA1377 was reported in proteomics studies to also interact with p31comet. Studies are ongoing to further clarify potential mitotic functions of KIAA1377.

Experimental characterization of other potential mitosis regulators
Five other co-transcription hits were also cloned as GFP fusions and examined for their subcellular localization. Three were retrieved from both Gene Sorter and CoExSearch searches (Additional file 6: Table S6), including PBK and MELK, encoding protein kinases, and CDKN3, encoding the dual specificity phosphatase KAP that is closely related to CDC14 phosphatases [74]. DDX39, ranked 61st in the Gene Sorter search (Table 2) and encoding an RNA helicase, and C4orf46, a gene of unknown function ranked 32nd in the CoExSearch result (Additional file 5: Table S5), were also included for analyses. PBK/TOPK was previously shown to regulate cytokinesis [75]. KAP, originally discovered as a CDK inhibitor, interacts with CDK2 and CDC2 [76,77]. However, the subcellular localization of GFP-PBK and GFP-CDKN3 cannot be distinguished from that of GFP alone (Additional file 10: Figure S3). GFP is diffusely distributed in interphase cells with slightly higher accumulation in the nuclei. GFP alone was also observed to be concentrated at the midbody and, to a lesser extent, the mitotic spindle (Additional file 10: Figure S3, first row). Lack of specific subcellular localization does not necessarily preclude a protein from playing an active role during mitosis. A C-terminally tagged C4orf46-GFP shows localization in mitotic cells similar to that of GFP, GFP-PBK and GFP-CDKN3, although in interphase cells it is primarily localized in ER-like structures and the nuclear envelope (Additional file 10: Figure S3, bottom row).
GFP-DDX39 and GFP-MELK displayed more interesting subcellular localization. Although the bulk of GFP-DDX39 is diffuse in both the cytoplasm and nuclei, a fraction of the GFP signals co-localizes with γ-tubulin throughout the cell cycle (Figure 4). Therefore, DDX39 is also a putative centrosomal protein and its possible functions at centrosomes or during mitosis warrant further investigation. It should be noted that RNA and RNA binding proteins have been found in centrosomes [78,79]. MELK was proposed to regulate G2/M transition by phosphorylating CDC25B [80]. More recently, Xenopus MELK was found to exhibit mitosis-specific localization at the cell cortex and to target the presumptive site of the cleavage furrow before any signs of ingression, suggesting a role in cytokinesis regulation [81]. Using nuclear localization of CENP-F as a G2 marker [82], we found that GFP-MELK is largely cytoplasmic in G1 cells, but partially translocated into nuclei in G2 cells (Figure 5A, top row). However, in late G2/early prophase cells where discrete CENP-F foci can be discerned, the nuclear level of GFP-MELK is reduced again (Figure 5A, bottom row). Similarly to what has been reported in the Xenopus system, GFP-MELK starts to accumulate at the cell cortex upon mitosis entry and becomes more evident following the metaphase-to-anaphase transition (Figure 5B, 1-4). Intriguingly, one or two transient GFP-MELK bands were observed in the midzone of late anaphase cells (Figure 5B, 5-6; Figure 5C), likely marking the presumptive cleavage site as in Xenopus cells.
The bands coalesce as cytokinesis progresses, but cortical association of GFP-MELK remains until sometime in early G1 (Figure 5B, 7-8). The dynamic localization pattern of GFP-MELK seems consistent with its proposed roles in G2/M transition and cytokinesis regulation. The apparent evolutionary conservation between Xenopus and human MELKs prompts further studies on their mechanisms of action in regulating mitosis progression.

Conclusions
In conclusion, we have compiled the most comprehensive list so far of centromere/kinetochore proteins in human cells. Data mining of gene expression and PPI databases using the centromere/kinetochore proteins as queries has retrieved candidate novel mitosis regulators. Experimental validation has discovered two novel centrosomal proteins, KIAA1377 and DDX39, and one novel kinetochore protein, TRIP13. Functional characterization of these proteins will likely reveal novel mechanisms of mitosis regulation. We conclude that transcriptional co-expression and PPI network analyses with known human centromere/kinetochore proteins as a query group help identify novel mitosis regulators.

Methods
Literature search and gene ontology analysis
The list of human centromere/kinetochore proteins was first derived from a previous review [16], and then updated through an exhaustive abstract search in PubMed. Tracking the references during full-text literature review and screening through the Gene Ontology (GO) website (http://www.geneontology.org/) also contributed to the compilation. The last amendment of the list was made on April 30, 2012. For GO analysis, the human genes annotated with "centromere" or "kinetochore" in the GO terms or IDs were filtered with the "H. Sapiens" species filter and downloaded in the "gene association format" into Microsoft Excel. The conversion between gene symbols and IDs was carried out using Gene symbol Gene ID converter [83] (http://idconverter.bioinfo.cnio.es/). The GO enrichment analysis was performed using FuncAssociate 2.0 (http://llama.med.harvard.edu/funcassociate/).

Transcriptional co-expression analysis
The transcriptional expression profiling data in the UCSC Human Gene Sorter (Mar. 2006 datasets) were used for transcriptional co-expression analysis [84]. The depository contains data on the human transcriptome in over 70 tissues and cell lines obtained on three microarray chips. A Gene Sorter search returns, for each query, a list of genes ranked by similarity in expression patterns [85,86]. For each centromere/kinetochore query, the top 50 co-expressing genes on all chips were collected. A total of 155 "top 50" lists for the 64 core centromere/kinetochore components were then pooled. The Pivot Table function in Excel was used to count the occurrences of each gene in the combined list after removing queries. The co-expression was also analyzed later using the CoExSearch program (http://coxpresdb.jp/top_search.shtml#CoExSearch) [48,49], which accepts a group of queries to search for common co-expressing genes and ranks them based on a co-expression measure, "mutual rank" (MR) [49]. A total of 4401 microarray expression datasets (no overlap with data in Gene Sorter) were used for human gene co-expression analysis.
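The pooling and counting step described above was done with an Excel Pivot Table; an equivalent tally can be sketched in Python. This is an illustrative sketch only. The function name `pool_top_lists` and the toy lists are hypothetical, and the sketch assumes that "removing queries" means dropping each query gene from its own "top 50" lists (so that query genes can still be counted in the lists of other queries).

```python
from collections import Counter


def pool_top_lists(top_lists_by_query, top_n=50):
    """Count occurrences of each gene across pooled top-N co-expression lists.

    top_lists_by_query maps a query gene to its ranked lists (one per chip).
    Assumption: a query is removed only from its own lists (self-hit).
    """
    counts = Counter()
    for query, ranked_lists in top_lists_by_query.items():
        for ranked_genes in ranked_lists:
            for gene in ranked_genes[:top_n]:
                if gene != query:
                    counts[gene] += 1
    return counts


# Toy input for illustration only (not Gene Sorter output).
toy_lists = {
    "ZWINT": [["ZWINT", "TRIP13", "CCNB1"], ["TRIP13", "ZWINT", "BUB1"]],
    "BUB1": [["BUB1", "ZWINT", "TRIP13"]],
}
pooled = pool_top_lists(toy_lists)
ranked_hits = pooled.most_common()  # genes sorted by number of occurrences
```

Sorting the tally by count gives the kind of occurrence ranking reported in Table 2 and Additional file 3: Table S3.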
Protein-protein interaction network analysis
PPI network analysis was performed using online tools provided at the POINeT website (http://poinet.bioinformatics.tw) [52,53,87], using the 64 "core" centromere/kinetochore proteins as a query group. Only experimentally determined interactions were used in the analysis. The POINeT website has imported data from several of the most popular PPI databases, and ranks the proteins based on a "subnetwork specificity score" (S3 score) reflecting their enrichment in a specific biological process defined by the query group. The scoring system consists of two parts. The first part examines the ratio of the subnetwork degree to the global degree of any given node. In other words, it compares the number of PPIs (degree) between a protein (node) and members of the query group to the total number of PPIs involving the protein. The second part compares the number of PPIs involving a certain protein with members of the query group to the number of PPIs between the same protein and 1,000 randomly generated groups of the same size as the query group [52,53].
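The two parts of the scoring idea can be illustrated with a small Python sketch. This is not POINeT's implementation: the function names, the toy network and the way the random comparison is summarized (the fraction of random groups that match or exceed the observed count) are assumptions made for illustration, and how the two parts are combined into the S3 score is not specified here.

```python
import random


def subnetwork_degree(node, group, ppi):
    """Number of interaction partners of `node` that belong to `group`."""
    return len(ppi.get(node, set()) & set(group))


def degree_ratio(node, query_group, ppi):
    """Part 1: subnetwork degree divided by global degree of the node."""
    global_degree = len(ppi.get(node, set()))
    if global_degree == 0:
        return 0.0
    return subnetwork_degree(node, query_group, ppi) / global_degree


def random_group_comparison(node, query_group, ppi, n_random=1000, seed=0):
    """Part 2 (assumed summary): fraction of random groups, of the same size
    as the query group, with at least as many PPIs to `node` as observed."""
    rng = random.Random(seed)
    proteins = set(ppi) | {p for partners in ppi.values() for p in partners}
    universe = sorted(proteins - {node})
    size = min(len(set(query_group)), len(universe))
    observed = subnetwork_degree(node, query_group, ppi)
    hits = 0
    for _ in range(n_random):
        random_group = rng.sample(universe, size)
        if subnetwork_degree(node, random_group, ppi) >= observed:
            hits += 1
    return hits / n_random


# Toy symmetric network for illustration only (hypothetical protein names).
toy_ppi = {
    "X": {"Q1", "Q2", "A"},
    "Q1": {"X"}, "Q2": {"X"},
    "A": {"X", "B"}, "B": {"A"},
    "C": set(), "D": set(),
}
ratio_x = degree_ratio("X", ["Q1", "Q2"], toy_ppi)  # 2 of 3 partners are queries
chance_x = random_group_comparison("X", ["Q1", "Q2"], toy_ppi)
```

A protein with many interactors overall but few within the query group gets a low ratio in part 1, which is consistent with the observation in the Results that TRIP13 missed the top-ranking PPI list because of its large number of total interactors.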
Recombinant DNA, recombinant protein and in vitro protein binding assay
DNA cloning was performed using the Gateway system (Invitrogen) [88,89]. Full-length cDNAs encoding selected proteins were amplified and cloned into the pENTR-TOPO vector. The constructs were verified by DNA sequencing and then recombined into different destination vectors for protein expression in E. coli or mammalian cells. Recombinant GST-p31comet and 6×His-TRIP13 were expressed and purified as described before [88,89]. The in vitro binding assay with recombinant proteins was performed essentially as in [89], except that Probond nickel beads (Invitrogen) were used for pull-down.

DNA transfection and immunofluorescence
DNA transfection and immunofluorescence were performed essentially according to [90]. HeLa-M or HEK293 cells were transfected using Fugene 6 (Roche) or TransIT-LT1 (Mirus) following the manufacturers' instructions. Cells were usually fixed 24 to 48 h after transfection in 3.5 % paraformaldehyde for 7 min, extracted with KBT (10 mM Tris-HCl, pH 7.5, 150 mM NaCl, 1 mg/ml BSA and 0.2 % Triton X-100) for 5 min, and then blocked with KB (KBT omitting Triton X-100) for at least 5 min prior to immunofluorescence. In cases where γ-tubulin was probed, the cells were fixed and extracted simultaneously in 3.5 % paraformaldehyde containing 1 % Triton X-100 for 7 min, and then blocked with KB. The list of antibodies used can be provided upon request. The images were collected by a cooled CCD camera (CoolSNAP HQ2; Photometrics) mounted on an automated Olympus IX-81 microscope using a PlanApo 60× NA 1.42 oil objective, with the z-step mostly set at 1 μm. Image acquisition and analysis were performed using Slidebook software (Intelligent Imaging Innovations), and images were further processed in Adobe Photoshop for presentation.

Preparation of mitotic chromosome spreads
Transfected HeLa-M cells were harvested after 16 hr treatment with nocodazole (60 ng/ml final concentration) and swollen for 30 min in 75 mM KCl at room temperature. One millilitre of cell suspension was added to a 35 mm dish containing coverslips, and spun at 1,000 × g for 12 min in a Legend RT-Plus centrifuge (Thermo Scientific) on top of a 15 ml tube holder fitted inside a hanging bucket. The chromosome spreads were then fixed and processed for immunofluorescence.
Figure legend (fragment): ... counterstained with DAPI (blue) and microtubules are stained with anti-α-tubulin antibody (red). Note that microtubule staining is not always easily discernible because single focal plane images are shown, and the contrast is optimized to show the microtubule bundles at the midbody in cells undergoing cytokinesis. Bar = 10 μm.