An initial biochemical and cell biological characterization of the mammalian homologue of a central plant developmental switch, COP1

Background Constitutive photomorphogenic 1 (COP1) has been defined as a central regulator of photomorphogenic development in plants, which targets key transcription factors for proteasome-dependent degradation. Although a mammalian COP1 homologue has been previously reported, its function and distribution in the animal kingdom are not known. Results Here we report the characterization of full-length human and mouse COP1 cDNAs and the genomic structures of the COP1 genes from several different species. Mammalian COP1 protein binds to ubiquitinated proteins in vivo and is itself ubiquitinated. Furthermore, mammalian COP1 is predominantly nuclear localized and exists primarily as a complex of over 700 kDa. Through mutagenesis studies, we have defined a leucine-rich nuclear export signal (NES) within the coiled-coil domain of mammalian COP1 and a nuclear localization signal (NLS), which is composed of two clusters of positively charged amino acids, bridged by the RING finger. Disruption of the RING finger structure abolishes nuclear import, while deletion of the entire RING finger restores nuclear import. Conclusions Our data suggest that mammalian COP1, similar to its plant homologue, may play a role in ubiquitination. Mammalian COP1 contains a classic leucine-rich NES and a novel bipartite NLS bridged by a RING finger domain. We propose a working model in which the COP1 RING finger functions as a structural scaffold to bring two clusters of positively charged residues within spatial proximity to mimic a bipartite NLS. Therefore, in addition to its well-characterized role in ubiquitination, the RING finger domain may also play a structural role in nuclear import.


Background
Arabidopsis seedlings display distinct morphologies when grown in the dark compared to the light. Light-grown seedlings develop photomorphogenically, characterized by short hypocotyls and open green cotyledons. In contrast, dark-grown seedlings undergo skotomorphogenesis (or etiolation), typified by elongated hypocotyls and closed cotyledons [1]. COP1 was first identified through genetic screens as a negative regulator of light-regulated development in Arabidopsis [2]. Arabidopsis cop1 mutant seedlings are constitutively photomorphogenic even when grown in the dark, and severe cop1 mutations cause lethality in the late seedling stage, indicating that COP1 is essential for plant development [2,3]. Arabidopsis COP1 (Arabidopsis thaliana COP1, AtCOP1) is essential for the proteasome-dependent degradation of two transcription factors, HY5 and HYH [4,5]. These two homologous bZIP-type transcription factors directly interact with AtCOP1 and are capable of binding to light-responsive promoters to activate the transcription of many target genes [5,6]. Genome-wide microarray analysis shows that AtCOP1 regulates most, if not all, of the light-responsive genes under various light conditions [7,8], substantiating the notion that AtCOP1 functions as a crucial developmental switch by targeting key transcription factors for degradation, thereby controlling light-responsive gene expression and photomorphogenic development.
AtCOP1 contains three conserved structural domains: a RING finger at the amino terminus, a coiled-coil domain in the middle, and a carboxyl-terminal WD40 repeat domain [9,10]. Each of the three conserved domains has been shown to mediate protein-protein interactions [11][12][13]. The subcellular localization of AtCOP1 is regulated by light in a tissue-specific manner [14,15]. The hypocotyl cell nuclei contain high levels of COP1 in the dark and reduced levels in the light, suggesting that the nucleocytoplasmic partitioning of AtCOP1 is adjusted by a light-responsive mechanism [14,16]. The activity of AtCOP1 is at least in part regulated by its subcellular localization, as the degradation of HY5 is dependent upon the nuclear accumulation of AtCOP1 in the dark [4]. AtCOP1 was demonstrated to carry a single, bipartite nuclear localization signal located between the coiled-coil domain and the WD40 domain (amino acids 294-314) and a cytoplasmic localization signal, which was mapped to a region partially overlapping with the RING finger and the coiled-coil domain (amino acids 67-117) [17]. Strikingly, AtCOP1 protein forms characteristic nuclear speckles when transiently expressed in onion epidermal cells or stably expressed in transgenic Arabidopsis [6,18]. The functional role of these speckles is currently unknown; however, a subnuclear localization signal consisting of 58 residues (amino acids 120-177) is required for their formation [19].
A partial cDNA clone homologous to AtCOP1 has been previously identified in mammals, containing all three conserved protein-protein interaction domains [20]. Remarkably, when expressed in plant cells, the mammalian COP1-reporter fusion protein exhibited a light-regulated nuclear localization pattern similar to AtCOP1 [20]. Although mammalian COP1 failed to rescue the Arabidopsis cop1 mutants, over-expression of the amino-terminal half of mammalian COP1 in Arabidopsis caused a dominant-negative phenotype, resembling the effect of over-expressing the corresponding fragment of AtCOP1 [20,21].
To understand the function of COP1 in animals, we conducted an initial molecular characterization of mammalian COP1 homologues. We report here the cloning of the full-length mouse and human COP1 cDNAs (Mus musculus COP1, MmCOP1; Homo sapiens COP1, HsCOP1) and the comparison of COP1 homologues from both animals and plants in terms of both protein sequences and genomic structures. Moreover, we show that mammalian COP1 is associated with ubiquitinated proteins in vivo and is itself a substrate of ubiquitination. Finally, examination of the subcellular localization patterns of mammalian COP1 reveals a unique type of nuclear import signal and a classic nuclear export signal, which might be important for the regulation of mammalian COP1 activity.

Characterization of full-length cDNAs of mouse and human COP1s
The previously reported partial MmCOP1 mRNA (GenBank accession number AF151110.1) encodes a protein product that corresponds to the full-length AtCOP1, but without the ATG start codon at its 5' end [20]. Homology search against the updated public EST database with the 5' end sequence of the available partial MmCOP1 cDNA allowed the identification of human and mouse EST clones containing a predicted start codon and a partial 5' UTR, and the subsequent construction of full-length MmCOP1 and HsCOP1 cDNAs (see Experimental procedures for details). The presence of stop codons just upstream of the predicted start codons and their comparable sizes with mRNA on northern blots (data not shown) suggest that they contain full-length coding capacity. The full-length MmCOP1 and HsCOP1 cDNAs (GenBank accession numbers AF151110.2 and AF508940) encode polypeptides of 773 and 771 amino acids respectively and share 97.4% identity (Fig 1A and data not shown). Compared to AtCOP1, the predicted full-length MmCOP1 and HsCOP1 proteins contain unique N-terminal glycine-serine rich extensions of about 70 amino acids (Fig 1A).
Mammalian COP1 is expressed in many different tissues and organs in both embryos and adults, based on the available EST information. Mammalian COP1 EST clones have been found in aorta, B-cell, bladder, brain, breast, cervix, colon, germ cell, head/neck, heart, kidney, liver, lung, myotube, nerve, ovary, parathyroid, placenta, prostate, single cell zygote, spleen, testis, T-cell, tonsil, unfertilized egg, uterus, and whole embryo from human, mouse and rat. Northern blot analysis with selected tissue types confirmed this ubiquitous expression pattern (data not shown).

Genomic structures of animal COP1s and the evolutionary implication
Figure 1
The general characteristics of the mammalian COP1 proteins and genes. (A) Predicted amino acid sequence of HsCOP1. The N-terminal glycine-serine rich extension is underlined. The RING finger domain is boxed. The coiled-coil domain is indicated by dotted lines. The WD40 repeats are double underlined by solid lines and dashed lines. The sequence data were deposited in GenBank/EMBL/DDBJ under the accession number AF508940. (B) The genomic structure of the HsCOP1 gene. The HsCOP1 gene is located on chromosome 1 at position 1q24.1 based on the ideogram. Exons, represented by boxes with the exon numbers indicated above, are distributed over a solid line representing introns, according to proportion. The length of the HsCOP1 gene is indicated above the gene structure diagram. The corresponding sequenced human genomic BAC clones, represented by solid lines, are aligned under the HsCOP1 gene diagram. The HsCOP1 genomic sequence data are available in the Third Party Annotation Section of the DDBJ/EMBL/GenBank databases under the accession number TPA: BK000438. (C) A phylogenetic tree of COP1 proteins from representative species of both animal and plant kingdoms. The phylogenetic tree is constructed by ClustalW software [57], using protein sequences corresponding to the first 200 amino acids of HsCOP1. The protein sequence of HsCOP1 (human) is available under GenBank accession number AF508940. The protein sequence of MmCOP1 (mouse) is available under GenBank accession number AF151110. The partial protein sequence of XlCOP1 (xenopus) was predicted from four xenopus EST clones (GenBank accession numbers BE506274, BE679680, BE026293 and BG345699). The partial protein sequence of GgCOP1 (chicken) was predicted from one chicken EST clone (GenBank accession number BM487594). The partial protein sequence of DrCOP1 (zebrafish) was predicted from one EST clone (GenBank accession number AI959013). The partial protein sequence of FrCOP1 (fugu) was predicted from scaffold 5023 of the draft fugu genomic sequence. The protein sequence of AgCOP1 (mosquito) was predicted from the scaffold CRA_x9P1GAV59LQ of the draft mosquito genomic sequence. The protein sequences of AtCOP1 (Arabidopsis), PsCOP1 (pea), LeCOP1 (tomato), InCOP1 (morning glory), and OsCOP1 (rice) are available under GenBank accession numbers L24437, Y09579, AF029984, AF315714, AB040053, respectively.

Using the predicted HsCOP1 cDNA sequence, BLAST searches against the draft human genome database revealed significant hits on chromosomes 1, 3, 9 and 18. Because only the homologous sequences on chromosome 1 match 100% with the HsCOP1 cDNA, we conclude that the functional human COP1 gene (designated HsCOP1) is located on chromosome 1q24.1 (Fig 1B), and the homologous sequences from chromosomes 3, 9 and 18 probably represent COP1 pseudogenes, as they all seem to encode truncated proteins. The corresponding sequenced BAC clones on this chromosomal location (GenBank accession numbers AL359265, AL162736, AL590723, and AL513329) can be assembled together with the reference of HsCOP1 cDNA sequences (Fig 1B). The HsCOP1 gene contains 20 exons and the transcribed region is about 263 kb in length (GenBank accession number TPA: BK000438, Fig 1B). We were also able to identify the full-length genomic sequence of mosquito COP1 (Anopheles gambiae COP1, AgCOP1) and partial genomic sequences of MmCOP1 and fugu COP1 (Fugu rubripes COP1, FrCOP1). The genomic structures of the vertebrate COP1 genes appear to be generally conserved among themselves, but different from mosquito and plant COP1 genes (data not shown).
BLAST searches have also identified COP1 homologues from many other animal species, including zebrafish, frog, chicken, rat, cow, dog, horse, and pig. However, we have so far failed to find COP1 homologues in Drosophila, C. elegans, or yeast, whose whole genome sequences are available. Interestingly, COP1 is found in mosquito but not in Drosophila.
Sequence comparison reveals that the three structural motifs present in Arabidopsis COP1, including the RING finger, the coiled-coil and the WD40 domains, exist in COP1 proteins from all other species. The glycine-serine rich N-terminal sequence appears to be unique to vertebrate COP1s, as it is missing in mosquito and plant COP1 proteins. However, it should be pointed out that the amino acids in this region are not well conserved and their functional role in vertebrates is unclear.
To reveal their evolutionary relationship, we generated a phylogenetic tree based on the amino acid sequences of the N-terminal RING finger-containing fragments (all corresponding to the first 200 amino acids of HsCOP1) from representative COP1 homologues, as only partial COP1 sequences are available in the public database for some species. Similar phylogenetic trees were generated with or without the vertebrate-specific N-terminal extensions, and a tree with the extension included in the analysis is shown in Fig 1C. As expected, plant and animal COP1s form distinct clades, and the homologies between COP1s from different species are generally consistent with the evolutionary distances between these species (Fig 1C).

The association of mammalian COP1 with ubiquitinated proteins in vivo
A number of RING finger proteins have recently been shown to act as E3 ubiquitin ligases [24]. WD40 domains have also been frequently found in E3 complexes as substrate recruiting domains [25][26][27][28]. Despite compelling evidence from Arabidopsis implying that COP1 may mediate protein degradation [4,5], no direct evidence of COP1 participating in ubiquitination has been reported.
As an initial step to study a possible role of mammalian COP1 in ubiquitination, we transiently expressed FLAG-tagged COP1 (FLAG-COP1) in the human cell line 293 together with HA-tagged ubiquitin (HA-Ub), followed by immunoprecipitation with anti-FLAG antibodies. Immunoblot analysis of the immunoprecipitates with anti-HA antibodies detected multiple distinct bands and a smear only when FLAG-COP1 and HA-Ub were co-expressed, indicating that at least a subset of COP1-associated proteins are ubiquitinated in human cells (Fig 2A).
To further investigate whether COP1 itself undergoes ubiquitination, a His-tagged ubiquitin expression construct (His-Ub) was co-transfected into the 293 cells with FLAG-COP1, followed by His purification under denaturing conditions (in 8 M urea). Under these conditions, only those proteins covalently linked to His-Ub would be co-purified. Immunoblot analysis of the precipitates with anti-FLAG antibodies detected a higher molecular weight smear only upon co-expression of FLAG-COP1 and His-Ub, confirming that COP1 is indeed a ubiquitination substrate (Fig 2B). Similar data were obtained when a deletion construct, FLAG-N280, which contains an intact RING finger domain, was used (see Fig 5A below). Though FLAG-N280 was not as heavily ubiquitinated as the full-length FLAG-COP1, it clearly showed a ladder pattern, in which the size difference between neighboring bands of the ladder was consistent with the unit molecular weight of His-Ub (Fig 2B). Therefore, mammalian COP1 not only associates with ubiquitinated proteins in vivo, but is also itself a substrate for multi-ubiquitination.
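The ladder argument above is simple arithmetic, and a minimal Python sketch may help make it concrete. The numbers are assumptions for illustration rather than measurements from this study: ubiquitin is taken as roughly 8.6 kDa (76 residues), the His tag is assumed to add about 1 kDa, and the 40 kDa substrate mass is a hypothetical placeholder. Apparent mobility on a gel is not strictly additive, so this only approximates where bands would be expected.

```python
# Sketch: expected masses for a substrate carrying 0..n His-Ub moieties.
UBIQUITIN_KDA = 8.6  # approximate mass of one ubiquitin (76 residues)
HIS_TAG_KDA = 1.0    # assumed extra mass contributed by the His tag


def ladder_masses(substrate_kda, max_units, unit_kda=UBIQUITIN_KDA + HIS_TAG_KDA):
    """Return expected masses (kDa) for 0..max_units conjugated His-Ub units."""
    return [round(substrate_kda + n * unit_kda, 1) for n in range(max_units + 1)]


def band_spacings(masses):
    """Return the mass difference between each pair of neighboring bands."""
    return [round(b - a, 1) for a, b in zip(masses, masses[1:])]


# 40 kDa is a hypothetical substrate mass, not a value from the paper.
bands = ladder_masses(substrate_kda=40.0, max_units=4)
spacings = band_spacings(bands)
```

In this sketch every spacing equals the assumed His-Ub unit mass, which is the signature that distinguishes a ubiquitin ladder from unrelated higher molecular weight bands.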

The mammalian COP1 protein is localized to both the nucleus and the cytoplasm
The subcellular localization of AtCOP1 is regulated by light [14], a feature that is critical for its function [4,29]. Interestingly, the localization of mammalian COP1 protein expressed in plant cells can also be regulated by light [20], implying that a similar mechanism may operate for the nucleocytoplasmic partitioning of COP1 in mammalian cells.
To investigate this possibility, we first set out to examine the subcellular localization of endogenous COP1 in cultured mammalian cells. Polyclonal antibodies against an N-terminal fragment of MmCOP1 (amino acids 71-270) were generated in rabbits and affinity purified. Endogenous COP1 is not detectable by immunofluorescence with our anti-COP1 antibodies, probably due to a very low expression level. Instead we took a subcellular fractionation approach to study the localization pattern of endogenous COP1 protein. The purified anti-COP1 antibodies detected a unique band of about 90 kDa from HeLa whole cell lysates, compared to the preimmune serum (Fig 3A). This 90-kDa band should represent the endogenous COP1, based on the observations that anti-COP1 antibodies harvested from two other rabbits detect a band of the same molecular weight (Fig 3A) and that RNAi against COP1 specifically reduces the level of this band (our unpublished data). After subcellular fractionation of HeLa cells, each fraction underwent immunoprecipitation and subsequent immunoblotting with anti-COP1 antibodies. As shown in Fig 3B, the endogenous COP1 is localized predominantly in the nucleus, but a small amount may also be present in the cytosol. Within the nucleus, COP1 is present in both the nucleoplasm (NP) and the nuclear envelope (NE) fractions, although COP1 is more enriched in the nucleoplasm (Fig 3B). Identical samples from all fractions were also probed with antibodies against other control proteins (Fig 3B). As expected, Lamin A and C, components of the nuclear lamina, were found predominantly in the NE fraction (Fig 3B). This fraction was also highly enriched in nuclear pore complexes, as evidenced by the enrichment of Nup62. Nup62 was detectable in smaller amounts in the cytosol and NP fractions as well (Fig 3B), consistent with its cellular distribution described previously [30].
On the other hand, Hsp110 was found predominantly in the cytosol fraction, while Hsp90 was found both in the cytosol and the NP fractions (Fig 3B). Two subunits of the COP9 signalosome, CSN2 and CSN6, were located both in the cytosol and the NP fractions, but not in the NE fractions (Fig 3B). These controls thus validated our fractionation results on COP1 subcellular distribution.
As another line of experiments, we generated expression constructs containing N-terminal GFP- and FLAG-tagged MmCOP1 and HsCOP1. As shown in Fig 3C, the subcellular localization pattern of GFP-COP1 varies when expressed in COS7 cells. In general, the localization patterns can be categorized into three types. In Type I cells, GFP-COP1 is found mostly in the cytoplasm, with some enrichment around the nucleus (Fig 3C, Type I). In Type II cells, GFP-COP1 is found in both the cytoplasm and the nucleus, and also on the NE (Fig 3C, Type II). In Type III cells, GFP-COP1 protein is found mostly in the nucleus, both in the NP and on the NE (Fig 3C, Type III). All three types of localization patterns are well represented among the transfected cells; their relative proportions vary in each experiment, but do not seem to correlate with the expression level of GFP-COP1. Similar results were obtained when other cell lines (including HeLa, NIH3T3, and CHO cell lines) were used or when immunofluorescence studies were carried out with FLAG-COP1 instead of GFP-COP1 (data not shown).

The full-length mammalian COP1 protein can localize to the nuclear envelope
To test if COP1 is tightly bound to the NE, we extracted cells with 1% Triton X-100 before fixation. This treatment has been shown to remove non-NE bound proteins from both the cytoplasm and the nucleus [31]. Indeed, after Triton extraction, GFP-COP1 protein was completely eliminated from the cytoplasm and the nucleoplasm (Fig 3C, +Triton). Only NE-bound GFP-COP1 protein was retained, as characterized by a distinct ring-like pattern around the nucleus. On the other hand, the GFP control was completely removed from the cells by the same treatment (data not shown), indicating that the retention of GFP-COP1 on the NE is mediated by COP1 and not by the GFP tag. Similarly, FLAG-COP1 also remained on the NE after Triton extraction (data not shown). The different localization patterns in different populations of cells imply that mammalian COP1 may partition among the NP, NE, and cytoplasm compartments.

Human COP1 is part of protein complexes of over 700 kDa
Because HsCOP1 is enriched in the nucleoplasm fraction (Fig 3B), we used this fraction from HeLa cells for a gel filtration analysis to examine possible form(s) of HsCOP1 in human cells. Endogenous HsCOP1 was found to accumulate primarily in large complexes of over 700 kDa, whereas no monomeric or dimeric form of HsCOP1 was detected (Fig 4). This finding is distinct from the observation for AtCOP1, which exists mainly as a homodimer [32]. In addition, the HsCOP1 complexes differ from the HsCOP9 signalosome complexes in size, as suggested by the different peak fractions (Fig 4). Finally, transiently expressed FLAG or GFP tagged HsCOP1 or MmCOP1 cofractionates with endogenous COP1 (data not shown), suggesting the behaviors of these fusion proteins most likely represent the endogenous function of COP1 protein.

Both the nuclear and the cytoplasmic localization signals of the mammalian COP1 are present in the N-terminal region
As a first step to locate the signals responsible for the subcellular localization of mammalian COP1, we made a series of deletion constructs of GFP-MmCOP1 (Fig 5A). As shown in Fig 5A and 5B, GFP-∆N70 (amino acids 71-773), which lacks the vertebrate-specific N-terminal extension, also exhibited three general localization patterns in transfected cells (mostly cytoplasmic, in both compartments, or mostly nuclear), similar to full-length GFP-COP1 (Fig 3C). However, GFP-∆N70 does not display an NE localization pattern, indicating that the glycine-serine rich region formed by the first 70 amino acids may be responsible for targeting MmCOP1 to the NE (Fig 5A and 5B). Consistent with this, AtCOP1 does not contain this region and does not localize to the NE [14].
Unlike GFP-∆N70, GFP-∆RING (amino acid 226-773) was only found in the cytoplasm in all the cells examined (Fig 5A and 5B), suggesting that the NLS of MmCOP1 is probably located within amino acid 71 to 226 of MmCOP1. GFP-N346 (amino acid 71-416) contains the RING finger, the coiled-coil domain, and the region corresponding to the AtCOP1 NLS [17] and is localized exclusively in the nucleus, forming distinctive nuclear speckles (Fig 5A and 5B). GFP-N280 (amino acid 71-350), which contains the RING finger and the coiled-coil domain, but not the sequence corresponding to the NLS in AtCOP1, is also solely localized in the nucleus (Fig 5A and 5B). However, unlike GFP-N346, GFP-N280 does not form speckles (Fig 5B). These results show that the region from amino acid 351 to 416 is not necessary for nuclear localization but contains a signal for speckle formation. It is worth noting that deletion constructs of AtCOP1 covering the corresponding region were found localized predominantly in the cytoplasm, distinct from the mouse GFP-N280 fusion proteins [17].
In contrast to GFP-N346 and GFP-N280, GFP-N200 (amino acid 71-270), which contains an intact RING finger and a partial coiled-coil domain, is localized entirely in the cytoplasm (Fig 5A and 5B). Similar observations were made when other cell lines were used (HeLa, NIH3T3 and CHO cells) or by using FLAG-tagged construct series instead of GFP fusions (data not shown). Taken together, both the major nuclear and cytoplasmic localization signals seem to reside in the N-terminal region of MmCOP1, most likely between amino acid 71 and 350.

Identification of a classic leucine-rich nuclear export signal in mammalian COP1
Because both GFP-N200 and GFP-∆RING are localized exclusively in the cytoplasm, the cytoplasmic localization signal is probably located within the overlapping region of these two constructs (amino acids 226-270). Interestingly, a sequence (LANVNLMLELL, amino acids 227 to 237), located within the N-terminal region of the coiled-coil domain, matches classic leucine-rich nuclear export signals (NES), such as those from Rex and PKI (Fig 6A) [33,34]. In order to test if this sequence is a bona fide NES, we made mutations at two conserved leucine residues (L234A and L236A) in the GFP-N200 construct (Fig 6A). The mutant protein GFP-N200 NESmut, in contrast to wild-type GFP-N200, is entirely nuclear localized (Fig 6B). Furthermore, treatment of cells expressing wild-type GFP-N200 with the CRM1/exportin-specific inhibitor Leptomycin B (LMB) also causes the protein to be localized exclusively in the nucleus (Fig 6B). These results identify the sequence (LANVNLMLELL) within the coiled-coil domain of mammalian COP1 as a likely CRM1/exportin-specific NES. Notably, the corresponding site in AtCOP1 is within the mapped cytoplasmic localization sequence [17]. Moreover, the NLS must also be located within amino acids 71

Figure 6
Mapping the nuclear export signal of mammalian COP1 protein. (A) Diagram of the amino acid sequence of the wild-type COP1 NES, aligned with the NESs of Rex and PKI. The amino acid replacements in the mutant protein (N200 NESmut) are indicated by arrows. The conserved key residues are shown in boldface. Residues altered by site-directed mutagenesis are highlighted in red. (B) GFP fluorescence study of the subcellular localizations of GFP-N200 and GFP-N200 NESmut. The studies were carried out in COS7 cells. When indicated, cells were cultured in the presence of 10 ng/ml LMB for 3 hrs before fixation. LMB, leptomycin B. The same magnification was used for all images, and a 10-micrometer scale bar is shown in the N200+LMB image.
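The match between the COP1 sequence and a leucine-rich NES can be illustrated with a short Python sketch. The regular expression below encodes a commonly used, simplified NES consensus (four hydrophobic positions spaced as Φ-x(2,3)-Φ-x(2,3)-Φ-x-Φ); this pattern is an assumption brought in for illustration and is not taken from this study. Substituting alanine at the last two hydrophobic positions of the match is likewise an illustrative choice and is not claimed to reproduce the exact mutant shown in Fig 6A.

```python
import re

# Simplified classical NES consensus (an assumption, not from the paper):
# Phi-x(2,3)-Phi-x(2,3)-Phi-x-Phi, with Phi one of L, I, V, F, M.
NES_PATTERN = re.compile(r"([LIVFM]).{2,3}([LIVFM]).{2,3}([LIVFM]).([LIVFM])")


def find_nes(seq):
    """Return (start, end, hydrophobic_offsets) for the first match, or None."""
    match = NES_PATTERN.search(seq)
    if match is None:
        return None
    offsets = [match.start(group) for group in range(1, 5)]
    return match.start(), match.end(), offsets


def substitute(seq, offsets, new_residue="A"):
    """Replace the residues at the given 0-based offsets with new_residue."""
    residues = list(seq)
    for offset in offsets:
        residues[offset] = new_residue
    return "".join(residues)


wild_type = "LANVNLMLELL"  # candidate NES sequence quoted in the text
hit = find_nes(wild_type)

# Illustrative mutant: alanine at the last two hydrophobic positions of the hit.
mutant = substitute(wild_type, hit[2][-2:])
mutant_hit = find_nes(mutant)
```

Under this simplified consensus the wild-type sequence matches and the double-alanine variant does not, which mirrors the logic of testing an NES by mutating its key hydrophobic residues.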

Identification of amino acid clusters critical for mammalian COP1 nuclear import and speckle formation
Nuclear import/localization signals (NLS) are usually composed of one or two clusters of positively charged amino acids. To map the nuclear import signal in mammalian COP1, we first identified three clusters of positively charged amino acids within amino acids 71-270 for mutagenesis. Furthermore, we also chose to mutate a fourth positively charged amino acid cluster at amino acids 358 to 360, which corresponds to the Arabidopsis COP1 NLS [17]. Since GFP-N346 exhibited a clear and consistent nuclear localization pattern with speckles, most closely resembling the nuclear localization pattern of full-length MmCOP1, it was used as the starting point for the site-directed mutagenesis studies (Fig 7A).
Unlike wild-type GFP-N346, GFP-N346 NLSmut1 (R113S/K114N) and GFP-N346 NLSmut3 (K205T/R206S/K208T) are totally localized to the cytoplasm, while still forming speckles (Fig 7B). GFP-N346 NLSmut2 (K197N/K199N/R201S) is still localized to nuclear speckles, similar to wild-type GFP-N346 (Fig 7B). Comparable results were obtained when the same mutations were introduced into the GFP-N280 fusion protein (data not shown). Mutations of the fourth site (R358S/K360T) within GFP-N346, giving GFP-N346 NLSmut4, do not change the nuclear localization of GFP-N346, but abolish the speckles (Fig 7B). Consistent with this, GFP-N280, which does not contain site 4, does not form nuclear speckles (Fig 5B and 7B). These results demonstrate that both sites 1 and 3 are required for nuclear import, while site 4 is important for targeting N346 to the nuclear speckles. However, we do not know the functional significance of the nuclear speckles at this time. Further, we do not know whether the cytoplasmic speckles formed by GFP-N346 NLSmut1 and GFP-N346 NLSmut3 are in any way related to the nuclear speckles of wild-type GFP-N346.

A novel RING finger bridged bipartite nuclear localization signal is responsible for mammalian COP1 nuclear localization
In a typical bipartite NLS, the distance between the two clusters of positively charged amino acids is usually around 10 amino acids. However, site 1 and site 3 in GFP-N346 are more than 90 amino acids apart, separated by the RING finger domain (Fig 7A). To determine whether the RING finger plays a role in mediating nuclear import, we mutated two key zinc-binding cysteine residues in the RING finger domain of GFP-N346 (C158A/C161A, N346 RINGmut) to destroy the RING finger structure (Fig 7A). Indeed, these mutations abolish the nuclear localization of GFP-N346 (Fig 7B). The mutated proteins are located in cytoplasmic speckles similar to those of N346 NLSmut1 and N346 NLSmut3 (Fig 7B).
RING fingers are characterized by four conserved cysteine-cysteine or cysteine-histidine pairs, which bind two zinc ions [35,36] (Fig 7C). In particular, cysteine pair 1 and pair 3 bind one zinc ion, while cysteine-histidine pair 2 and cysteine pair 4 bind the other, thereby forming a unique cross-brace structure [36] (Fig 7C). Cysteine-histidine pair 2 and cysteine pair 3 are separated by two amino acids. Although the sequences other than the zinc-binding sites are not well conserved, the overall three-dimensional structures of all the characterized RING fingers are overwhelmingly similar, due to the conserved spacing between cysteine-histidine pair 2 and cysteine pair 3 and the tight zinc-binding ability [37][38][39][40][41][42]. One common feature is that the RING finger structure brings the flanking N- and C-terminal peptides close together. It is therefore possible that the role of the COP1 RING finger in nuclear import is to function as a structural scaffold, bringing the two clusters of positively charged amino acids (site 1 and site 3) within the right spatial proximity to fit into the binding sites of the nuclear import machinery, as illustrated in Fig 7C. To test this hypothesis, we deleted the RING finger in GFP-N346 (∆119-197), physically placing site 1 and site 3 within a distance of about 10 amino acids (Fig 7A). This mutant protein, GFP-N346 ∆RING, maintained the same localization pattern as wild-type GFP-N346 protein (Fig 7B). Therefore, mammalian COP1 appears to contain a novel nuclear import signal, composed of two clusters of positively charged amino acids bridged by a RING finger.
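The spacing argument can be checked with a few lines of Python. This is a minimal sketch under stated assumptions: the cluster boundaries are taken from the mutated residues reported above (site 1 ending at K114, site 3 starting at K205), which may not coincide exactly with the functional clusters, and the linker is counted as the residues lying strictly between the two clusters. Other counting conventions shift the result by a residue or two.

```python
def linker_length(cluster1_end, cluster2_start, deletion=None):
    """Count residues strictly between two clusters (1-based positions).

    deletion is an optional (first, last) inclusive range of removed residues.
    """
    linker = set(range(cluster1_end + 1, cluster2_start))
    if deletion is not None:
        first, last = deletion
        linker -= set(range(first, last + 1))
    return len(linker)


# Boundaries assumed from the mutated residues named in the text.
SITE1_END = 114             # site 1: R113/K114
SITE3_START = 205           # site 3: K205/R206/K208
RING_DELETION = (119, 197)  # the deletion described for GFP-N346 deltaRING

linker_with_ring = linker_length(SITE1_END, SITE3_START)
linker_without_ring = linker_length(SITE1_END, SITE3_START, RING_DELETION)
```

With these assumed boundaries the linker shrinks from 90 residues to 11, in line with the roughly 10-residue spacing typical of a bipartite NLS.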

Evidence for a role of COP1 in protein ubiquitination in vivo
Because of its role in mediating the degradation of HY5, HYH, and possibly other transcription factors in Arabidopsis [4,5] and the presence of a RING finger domain and a WD40 domain, AtCOP1 has been proposed to function as a RING finger-type ubiquitin E3 ligase, recruiting an E2 through its RING finger domain and binding to substrates through the WD40 repeats [4,43]. This hypothesis is also supported by findings on other negative regulators of plant photomorphogenesis. COP10, which has been shown to interact with AtCOP1 in yeast two-hybrid experiments, encodes a ubiquitin-conjugating enzyme (E2) variant [44]. The COP9 signalosome, an eight-subunit complex homologous to the lid subcomplex of the 26S proteasome [45], directly associates with the SCF E3 complex and promotes deneddylation of the cullin subunit of SCF in both plant and animal cells [46][47][48][49]. The exact relationship between COP1 and the COP9 signalosome remains obscure; it is known, however, that in Arabidopsis the COP9 signalosome is required for COP1 to accumulate in the nucleus in the dark [29]. Although COP1 most likely plays a role in regulating protein degradation, attempts to reconstitute ubiquitination of HY5 by AtCOP1 have been unsuccessful. Instead, CIP8, an AtCOP1-interacting RING-H2 finger protein, has been found to ubiquitinate HY5 in vitro [50,51]. Therefore, it is possible that COP1 recruits other RING finger proteins, such as CIP8, to mediate the ubiquitination of specific substrates. The existence of mammalian COP1 in large complexes is consistent with this possibility. In this study, we have demonstrated that mammalian COP1 is associated with ubiquitinated proteins and is itself a substrate for ubiquitination in vivo. However, we have failed to reconstitute the ubiquitination activity of mammalian COP1 in vitro. Purification of the mammalian COP1 complex may hold a key to understanding how COP1 mediates protein ubiquitination and degradation in cells.

Molecular basis for mammalian COP1 subcellular localization and its possible regulation
Endogenous mammalian COP1 is found in the cytosol, NP, and NE fractions in our subcellular fractionation study, with the majority of the protein in the NP fraction (Fig 3B). In our fluorescence localization studies with multiple mammalian cell lines, transiently over-expressed COP1 is prevalent in the cytoplasm, but is also found on the NE and within the nucleus (Fig 3C). This discrepancy is probably due to the over-expression of mammalian COP1 in the latter system. It is worth noting that endogenous COP1 appears to be expressed at very low levels in all of the mammalian cells we have tested.
Through systematic mutagenesis studies, we have been able to locate both the NLS and the NES to the N-terminal region of the mammalian COP1 protein (Fig 6 and Fig 7). The deletion constructs GFP-N346, GFP-N280 and GFP-N200 all contain both the NLS and the NES, but their subcellular localization patterns are strikingly different. GFP-N346 and GFP-N280 are localized exclusively to the nucleus, while GFP-N200 is localized entirely to the cytoplasm (Fig 5). However, treatment with LMB completely shifts GFP-N200 from the cytoplasm to the nucleus, implying that GFP-N200 shuttles between the nucleus and the cytoplasm, consistent with the presence of both the NES and the NLS in GFP-N200 (Fig 6 and Fig 7). The different domain compositions of GFP-N346, GFP-N280 and GFP-N200 may explain the distinct localization patterns of these mutant proteins. The NES is located within the coiled-coil domain (Fig 6), so it is plausible that the intact coiled-coil structure in GFP-N346 or GFP-N280 masks the NES, possibly through interactions with other proteins. The truncated coiled-coil domain in GFP-N200, on the other hand, is unable to shield the NES from binding to CRM1/exportin. Since GFP-N200 is primarily localized to the cytoplasm, an exposed NES is likely dominant over the NLS in the mammalian COP1 protein. It is tempting to speculate that, under certain conditions, the mammalian COP1 NES is shielded within the coiled-coil structure and the COP1 protein is imported into the nucleus through the RING finger bridged NLS; under other conditions, probably through a conformational change of the coiled-coil domain, the NES is exposed, overriding the NLS and transporting mammalian COP1 back into the cytoplasm. However, we cannot rule out the possibility that a second, unidentified NLS between amino acids 271 and 350 is responsible for the nuclear localization of GFP-N280 and GFP-N346.
This study reveals a novel type of nuclear localization signal within the mammalian COP1 N-terminal region, consisting of two clusters of positively charged amino acids bridged by a RING finger domain (Fig 7). Both clusters of positively charged amino acids are required, and they can function as a classic bipartite NLS when placed the right distance apart (Fig 7). Disrupting the structure of the RING finger abolishes nuclear import, but removing the entire RING finger restores nuclear import (Fig 7).
A RING finger sandwiched between a bipartite NLS could provide additional regulatory mechanisms for nuclear import. In order to be transported into the nucleus, it is necessary for a protein to bind to the nuclear import machinery, such as importin α/β, through its NLS. Our studies suggest the mammalian COP1 RING finger acts as a structural scaffold by placing the two clusters of positive residues within the right spatial proximity for binding to the nuclear import machinery. Since the RING finger is a protein-protein interaction domain, interaction of the mammalian COP1 RING finger with another protein may, under certain conditions, trigger a conformational change of the surrounding peptides, altering the proximity between the two positive residue clusters of the NLS to disrupt binding to the nuclear import machinery. Alternatively, recruitment of another protein to the mammalian COP1 RING finger may sterically hinder access of its NLS to the nuclear import machinery.
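The spacing requirement for a bipartite NLS can be illustrated with a minimal Python sketch. It assumes a simplified textbook consensus for a classic bipartite NLS (two basic residues, a spacer of 10-12 residues, then a five-residue window containing at least three basic residues). The function name `find_bipartite_nls`, the spacer bounds, and the 40-residue insert are hypothetical choices made for illustration, and the test sequence is the nucleoplasmin NLS region, not a COP1 sequence. The sketch considers linear spacing only, so it cannot represent a folded domain that brings distant clusters together in three dimensions.

```python
import re

BASIC = set("KR")  # lysine and arginine

def find_bipartite_nls(seq, min_spacer=10, max_spacer=12):
    """Return (start, end) spans matching a simplified bipartite NLS pattern.

    Pattern (an assumption, not taken from the paper): two basic residues,
    a spacer of min_spacer..max_spacer residues, then a 5-residue window
    in which at least 3 residues are basic.
    """
    hits = []
    # Zero-width lookahead so overlapping K/R pairs are all tested.
    for m in re.finditer(r"(?=[KR]{2})", seq):
        start = m.start()
        for spacer in range(min_spacer, max_spacer + 1):
            win_start = start + 2 + spacer
            window = seq[win_start:win_start + 5]
            if len(window) == 5 and sum(c in BASIC for c in window) >= 3:
                hits.append((start, win_start + 5))
                break
    return hits

# Nucleoplasmin bipartite NLS region, used only as a well-known test case.
NUCLEOPLASMIN = "KRPAATKKAGQAKKKKLD"

# Hypothetical construct: a 40-residue insert without K/R placed inside the
# spacer, standing in for an unfolded domain between the two basic clusters.
INSERT = "ACDEFGHILM" * 4
with_insert = NUCLEOPLASMIN[:6] + INSERT + NUCLEOPLASMIN[6:]

print(find_bipartite_nls(NUCLEOPLASMIN))
print(find_bipartite_nls(with_insert))
```

In this toy model the intact sequence matches, while the insert pushes the two clusters too far apart in the linear sequence to match; removing the insert restores the match. This mirrors only the distance argument made above and says nothing about how a structured RING finger behaves.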

RING finger: ubiquitination and/or subcellular localization?
Within the past few years, the RING finger motif has emerged as a core structural feature of many ubiquitin ligases (E3s) [52]. In most cases, RING fingers function as E2-recruiting domains: by directly interacting with E2s, they bring E2s into the vicinity of ubiquitination substrates, which bind to other parts of the RING finger proteins or to other subunits of the complexes that contain them. Although most evidence suggests that RING fingers may not directly participate in the enzymatic transfer of ubiquitin from E2s to the substrates, intact RING structures are essential for this function. In addition to their key roles in E3 ligase activity, RING finger domains have also been shown to be important for regulating the subcellular localization of E3s and/or their substrates in a number of cases [53][54][55][56]. Studies have shown that the MDM2 RING finger domain is required to promote nuclear export of p53, and this activity of the MDM2 RING finger seems to be coupled with its ability to promote ubiquitination of p53 [53][54][55]. Rbx1/ROC1, the small RING finger subunit of the SCF E3 complexes, promotes the nuclear accumulation of Cul1, another subunit of the SCF complexes [56]. These studies suggest a correlation between the ubiquitination activities of RING fingers and their roles in regulating the subcellular localization of the E3 ligases of which they are part.
Our results suggest that mammalian COP1 might play a role in ubiquitination (Fig 2). If COP1 is indeed an E3, its RING finger is likely to be indispensable for this function, considering the numerous examples of other RING fingers. On the other hand, we have also demonstrated that the RING finger is required for the nuclear localization of mammalian COP1 (Fig 7). Thus, the COP1 RING finger could play dual roles, in ubiquitination as well as in nuclear import.

Distinction in subcellular localization signals between higher plant and mammalian COP1 proteins
Comparison of the subcellular localization signals uncovers clear distinctions between AtCOP1 and MmCOP1. In AtCOP1, the sequence responsible for cytoplasmic localization maps to a region of 110 amino acids bordering both the RING finger and the coiled-coil domain ([17]; Fig 8). In MmCOP1, we have identified a single 10-amino acid classic leucine-rich NES within the corresponding region (Fig 8), although we have not eliminated the possible involvement of other sequences. Mammalian COP1 also has a unique N-terminal extension that seems to be involved in localizing the protein to the nuclear envelope (Fig 8). We have defined the MmCOP1 NLS as a RING finger mediated bipartite NLS variant, composed of 96 amino acids surrounding the RING finger domain (Fig 8). In contrast, the AtCOP1 NLS has been identified as a classic bipartite nuclear localization signal, consisting of 21 amino acids in the region between the coiled-coil domain and the WD-40 domain ([17]; Fig 8). Interestingly, in MmCOP1 the site matching the AtCOP1 NLS is found to be important for targeting the nuclear localized MmCOP1 deletion construct N346 to speckles rather than for mediating nuclear import (Fig 8). In AtCOP1, on the other hand, the site responsible for speckle formation maps within the cytoplasmic localization signal ([19]; Fig 8). One explanation for these discrepancies between AtCOP1 and MmCOP1 is that, during the course of evolution, the animal and plant COP1 proteins have adopted different regulatory mechanisms and probably different physiological and developmental functions. However, the protein sequences of the three major protein-protein interaction domains (RING, coiled-coil, WD-40 repeats) are highly conserved, possibly reflecting a conserved biochemical activity of COP1 in all organisms. Another possibility is that different cell types may use completely different subcellular targeting signals.
It is plausible that in some other mammalian cell types, subcellular targeting signals similar to those of Arabidopsis are used. This speculation would be consistent with the fact that the subcellular localization of AtCOP1 is regulated by light only in certain types of plant cells [14]. Furthermore, the protein fragments responsible for regulating the subcellular localization of AtCOP1 are quite conserved in MmCOP1, which may explain why mammalian COP1, when expressed in onion epidermal cells, can be regulated by light in a manner similar to AtCOP1 [20]. However, the same nuclear translocation signal of AtCOP1 has been used in all plant cell types examined [17,19], which argues against the latter hypothesis.
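As an illustration of what a "classic leucine-rich NES" looks like at the sequence level, the following Python sketch scans a protein sequence for a commonly used consensus, Φ-x(2-3)-Φ-x(2-3)-Φ-x-Φ, where Φ is a large hydrophobic residue (taken here as L, I, V, F or M). The function name `find_leucine_rich_nes` and the exact residue set are assumptions made for the example, and the test peptide is the well-characterized NES of the protein kinase inhibitor PKI, not the MmCOP1 NES, whose sequence is not given in this section. A consensus match of this kind is only a candidate and would still need experimental confirmation such as mutagenesis or LMB sensitivity.

```python
import re

# Simplified leucine-rich NES consensus (an assumption for illustration):
# PHI-x(2,3)-PHI-x(2,3)-PHI-x-PHI, with PHI in {L, I, V, F, M}.
PHI = "[LIVFM]"
NES_PATTERN = re.compile(
    rf"(?=({PHI}.{{2,3}}{PHI}.{{2,3}}{PHI}.{PHI}))"  # lookahead: overlapping hits
)

def find_leucine_rich_nes(seq):
    """Return a list of (start_index, matched_peptide) candidates."""
    return [(m.start(), m.group(1)) for m in NES_PATTERN.finditer(seq)]

# PKI NES peptide, used only as a known positive control.
PKI_PEPTIDE = "NSNELALKLAGLDINK"

# Hypothetical mutant: four key hydrophobic residues replaced by alanine.
PKI_MUTANT = "NSNEAALKAAGADANK"

print(find_leucine_rich_nes(PKI_PEPTIDE))
print(find_leucine_rich_nes(PKI_MUTANT))
```

The wild-type peptide yields one candidate, while the alanine-substituted version yields none, which parallels the logic of using point mutations of hydrophobic residues to test a putative NES.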

Conclusions
In this study, we cloned the full-length mouse and human COP1 cDNAs. Furthermore, we demonstrated that mammalian COP1 is associated with ubiquitinated proteins in vivo and is itself a substrate of ubiquitination. Finally, through subcellular localization studies, we identified a unique type of nuclear import signal and a classic nuclear export signal within the N-terminal region of mammalian COP1, which might be important for the regulation of mammalian COP1 activity.

DNA constructs
To obtain the full-length MmCOP1 cDNA, an EST clone (GenBank accession number AI509164) containing the N-terminal region of MmCOP1 cDNA in the pT7T3D-Pac vector was obtained from ResGen and digested with XhoI and HindIII. The resulting fragment was subcloned into the same sites of pBlueScript II KS (+). The resulting construct was further digested with NotI and PstI and combined with a NotI-PstI fragment containing the remaining C-terminal portion of a previously described MmCOP1 cDNA clone (GenBank accession number AF151110.1) [20], resulting in pKS-COP1. To generate pEGFP-COP1, pKS-COP1 was digested with ApaI and XbaI and the resulting fragment containing the MmCOP1 cDNA was subcloned into the same sites of the pEGFP I vector. pFLAG-COP1 was generated by subcloning a SalI-XbaI fragment of pEGFP-COP1 containing the MmCOP1 cDNA into the same sites of the pFLAG-CMV II vector. Direct sequencing was used to confirm in-frame insertion in these constructs. pEGFP-N280 and pFLAG-N280 were generated by inserting a HindIII-StyI fragment (containing amino acids 70-350 of MmCOP1; the StyI site was filled in with Klenow DNA polymerase) into pEGFP and pFLAG-CMV vectors digested with HindIII and SmaI. pEGFP-N346 was generated by inserting a HindIII-NdeI fragment (containing amino acids 70-416 of MmCOP1; the NdeI site was filled in with Klenow DNA polymerase) into the pEGFP vector digested with HindIII and SmaI. pEGFP-N280 was digested with EcoRI and BamHI, filled in with Klenow DNA polymerase on both ends, and re-ligated to generate pEGFP-N200. Point mutations were introduced using the QuikChange XL site-directed mutagenesis kit from Stratagene. pEGFP-N346 ∆RING was obtained by first introducing two XhoI sites into the construct, subsequently digesting with XhoI, and then re-ligating. The plasmids HA-Ub and His-Ub were described previously [22].
Full-length or partial cDNA sequences of the COP1 homologues in human, rat, cow, pig, Xenopus, chicken, and zebrafish were predicted by assembling the available homologous EST sequences from each species. A single EST clone (GenBank accession number BI461685) containing the full-length HsCOP1 open reading frame was obtained from ResGen and the coding region of HsCOP1 was sequenced. Eight EST clones from Rattus norvegicus (GenBank accession numbers AI071032, AW525130, BF560126, R47114, AW141780, BF285727, AW917150, and BF285717) were assembled into a near full-length rat COP1 cDNA (RnCOP1). A partial cow COP1 cDNA (BtCOP1) was predicted from six Bos taurus ESTs (GenBank accession numbers AV608813, AW462260, BE478862, BF042569, AW326433, and BF440399). Only one EST clone from the pig (Sus scrofa; GenBank accession number AW787013) was found to be homologous to the 3' end of HsCOP1. A zebrafish EST clone (GenBank accession number AI959013) was found to be homologous to the N-terminal end of HsCOP1. Four EST clones from Xenopus laevis were identified through a BLAST search against the HsCOP1 protein sequence. Three of them were overlapping clones and were assembled into a fragment homologous to the N-terminal region of the HsCOP1 protein (GenBank accession numbers BE506274, BE679680, and BE026293), and one clone was homologous to the C-terminal end of the HsCOP1 protein (GenBank accession number BG345699). One chicken EST clone (GenBank accession number BM487594) was found to be homologous to the N-terminal end of HsCOP1.

Genomic DNA and mRNA predictions
The mRNA sequences of FrCOP1 and AgCOP1 were predicted from scaffold 5023 and scaffold CRA_x9P1GAV59LQ of the draft fugu and mosquito genomic sequences, respectively.

Cell culture and transfection
293, COS7 and HeLa cells were grown in DMEM medium supplemented with 10% heat-inactivated fetal bovine serum (Life Technologies, Inc.). The mouse NIH3T3 cell line was grown in DMEM medium supplemented with 10% calf serum (Life Technologies, Inc.). For protein expression, cells were transfected with various plasmids using the LipofectAMINE or LipofectAMINE 2000 Reagent according to the manufacturer's instructions (Life Technologies, Inc.). Total amounts of plasmid DNA in individual transfection experiments were adjusted using empty vector plasmids. Transfected cells were cultured for at least 24 hours after transfection and used for immunoprecipitation, immunoblotting, and immunocytochemistry.

Figure legend: The protein domains (N-terminal extension, RING finger domain, coiled-coil domain, and WD-40 domains) are shaded in distinct patterns. The putative nuclear envelope localization sequence (NE), the nuclear localization signal (NLS), the nuclear export signal (NES), and the site important for speckle formation (Speckle) of MmCOP1 were determined in this study. The nuclear localization signal (NLS), the cytoplasmic localization signal (CLS), and the sequence important for speckle formation (Speckle) of AtCOP1 are depicted as previously described [15,17].

Immunoprecipitation and immunoblotting
Cells were lysed in lysis buffer (20 mM HEPES, pH 7.4, containing 100 mM NaCl, 5 mM MgCl2, 5 mM EDTA, 0.5% Triton X-100 and 1 mM PMSF) with 1× proteasome inhibitor cocktail (Boehringer Mannheim). Immunoprecipitations from the transfected cell lysates were performed with anti-FLAG antibody-coupled agarose beads (Sigma), and the beads were then washed six times in the lysis buffer. Immunoprecipitates or total cell lysates were analyzed by immunoblotting. Anti-FLAG (M2) antibody was purchased from Sigma. The anti-HA and mAb414 monoclonal antibodies were purchased from Covance. The anti-Lamin A/C antibody was purchased from Stratagene. Antibodies against the heat shock proteins were purchased from Transduction Laboratories.

Purification of His-tagged protein
Cells were lysed in a denaturing lysis buffer (8 M urea, 0.1 M NaH2PO4 and 0.01 M Tris•Cl, pH 8.0) and then briefly sonicated at full power. After centrifugation at maximum speed for 15 minutes, the supernatant was incubated with Ni-NTA resin (Qiagen) for 1.5 hours at room temperature. The resin was washed six times in a denaturing wash buffer (8 M urea, 0.1 M NaH2PO4 and 0.01 M Tris•Cl, pH 6.3) and then boiled for 10 minutes in an equal volume of 2× SDS loading buffer.

Cell fractionation
Cell fractionation was carried out mostly as previously described [23]. For isolation of nuclei and nuclear envelopes, HeLa cells from ten 15-cm culture dishes were washed twice with ice-cold PBS and scraped off in PBS. After centrifugation at 500 g for 5 minutes, the cells were resuspended in 10 ml of STM 0.25 (50 mM Tris/HCl, pH 7.4, 0.25 M sucrose, 5 mM MgSO4, 2 mM DTT, 1× protease inhibitor cocktail, 1 mM PMSF). The following manipulations were performed either with ice-cold reagents or at 4°C unless noted. A solution of 10% Nonidet P-40 was added to a final concentration of 0.025%. The pellets were homogenized by 30-40 strokes in a glass-glass Dounce homogenizer. The homogenate was adjusted to 1.4 M sucrose by addition of an appropriate volume of STM 2.1 (50 mM Tris/HCl, pH 7.4, 2.1 M sucrose, 5 mM MgSO4, 2 mM DTT, 1× protease inhibitor cocktail, 1 mM PMSF). 2.5 ml of this suspension was transferred to each centrifuge tube and layered between a 100 µl STM 2.1 cushion and 200 µl of STM 0.8 (50 mM Tris/HCl, pH 7.4, 0.8 M sucrose, 5 mM MgSO4, 2 mM DTT, 1× protease inhibitor cocktail, 1 mM PMSF). The tubes were then filled with STM 0.25 and centrifuged at 100,000 g for 65 minutes in a fixed-angle rotor. The pellets containing the nuclei were washed twice with STM 0.25. Integrity and purity of the nuclei were judged by phase contrast microscopy.
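The "appropriate volume" of STM 2.1 and the volume of 10% Nonidet P-40 follow from a simple mixing balance, C1·V1 + C2·V2 = Cf·(V1 + V2). The Python sketch below works this out under stated assumptions: volumes are treated as additive, the homogenate is taken to be exactly 10 ml at 0.25 M sucrose (ignoring the cell volume and the small detergent addition), and the helper name `volume_to_add` is hypothetical. Actual volumes should be adjusted to the measured homogenate volume.

```python
def volume_to_add(v_start, c_start, c_add, c_target):
    """Volume of a stock at c_add to mix into v_start (at c_start) to reach c_target.

    Derived from c_start*v_start + c_add*v_add = c_target*(v_start + v_add),
    assuming volumes are additive. Units of the result match v_start.
    """
    if not min(c_start, c_add) < c_target < max(c_start, c_add):
        raise ValueError("target concentration must lie between the two inputs")
    return v_start * (c_target - c_start) / (c_add - c_target)

# Assumed starting point: 10 ml of homogenate in STM 0.25 (0.25 M sucrose).
homogenate_ml = 10.0

# STM 2.1 (2.1 M sucrose) needed to bring the homogenate to 1.4 M sucrose.
stm21_ml = volume_to_add(homogenate_ml, 0.25, 2.1, 1.4)

# 10% Nonidet P-40 needed for a final concentration of 0.025% in 10 ml.
np40_ml = volume_to_add(homogenate_ml, 0.0, 10.0, 0.025)

print(f"STM 2.1 to add: {stm21_ml:.2f} ml")
print(f"10% NP-40 to add: {np40_ml * 1000:.1f} ul")
```

Under these assumptions, roughly 16.4 ml of STM 2.1 and about 25 µl of 10% Nonidet P-40 would be needed for 10 ml of homogenate.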
The nuclei prepared above were suspended in 10 ml of ice-cold TP buffer (10 mM Tris/HCl, pH 8.0, 10 mM Na2HPO4, 1 mM PMSF) containing 0.3 mg/ml heparin and 36 U/ml of DNase I. The suspension was gently stirred on a magnetic stirrer for 60 minutes at 4°C and 15 minutes at room temperature. Nuclear envelopes were sedimented at 10,000 g for 30 minutes at 4°C and washed once with STM 0.25.
The cell homogenate, prepared as described above, was centrifuged at 170 g for 15 minutes at 4°C. The supernatant was centrifuged at 300 g for 5 minutes at 4°C and then sedimented at 15,000 g for 10 minutes at 4°C. The pellet was washed with STM 0.25 and designated the crude mitochondrial fraction. The supernatant was diluted with 1 volume of STM 0.0 (STM 0.25 without sucrose) and centrifuged at 105,000 g overnight at 4°C. The resulting supernatant and pellet yielded the cytosolic fraction and the microsomal fraction, respectively.

Immunofluorescence and Triton extraction
Cells were fixed with 4% paraformaldehyde and indirectly immunostained with an anti-FLAG antibody (Sigma) followed by a FITC-conjugated goat anti-mouse secondary antibody (Molecular Probes). DAPI (Sigma) was used for nuclear counterstaining. For certain experiments, cells were extracted with 1% Triton X-100 in PBS for 10 minutes on ice and then washed three times in PBS before fixation. A Zeiss Axiophot fluorescence microscope was used for observation. Images were taken using the Zeiss AxioCam HRc digital imaging system.