
Journal of the Korean Astronomical Society - Vol. 51 , No. 5

Journal of the Korean Astronomical Society - Vol. 51, No. 5, pp.129-142
Abbreviation: JKAS
ISSN: 1225-4614 (Print) 2288-890X (Online)
Print publication date 31 Oct 2018
Received 12 Mar 2018 Accepted 18 Sep 2018
DOI: https://doi.org/10.5303/JKAS.2018.51.5.129

NEW PHOTOMETRIC PIPELINE TO EXPLORE TEMPORAL AND SPATIAL VARIABILITY WITH KMTNET DEEP-SOUTH OBSERVATIONS
Seo-Won Chang1, 2, 3 ; Yong-Ik Byun3 ; Min-Su Shin4 ; Hahn Yi3 ; Myung-Jin Kim4 ; Hong-Kyu Moon4 ; Young-Jun Choi4 ; Sang-Mok Cha4, 5 ; Yongseok Lee4, 5
1Research School of Astronomy and Astrophysics, The Australian National University, Canberra, ACT 2611, Australia (seowon.chang@anu.edu.au)
2ARC Centre of Excellence for All-sky Astrophysics (CAASTRO)
3Department of Astronomy and University Observatory, Yonsei University, Seodaemun-gu, Seoul 03722, Korea
4Korea Astronomy and Space Science Institute, 776 Daedukdae-ro, Yuseong-gu, Daejeon 34055, Korea
5School of Space Research, Kyung Hee University, Giheung-gu, Yongin, Gyeonggi 17104, Korea

Correspondence to : S.-W. Chang


JKAS is published under Creative Commons license CC BY-SA 4.0.

Abstract

The DEEP-South (the Deep Ecliptic Patrol of the Southern Sky) photometric census of small Solar System bodies produces massive time-series data of variable, transient or moving objects as a byproduct. To fully investigate unexplored variable phenomena, we present an application of multi-aperture photometry and FastBit indexing techniques for faster access to a portion of the DEEP-South year-one data. Our new pipeline is designed to perform automated point source detection, robust high-precision photometry and calibration of non-crowded fields which overlap with previously surveyed areas. In this paper, we show some examples of catalog-based variability searches to find new variable stars and to recover targeted asteroids. We discover 21 new periodic variables with periods ranging between 0.1 and 31 days, including four eclipsing binary systems (detached, over-contact, and ellipsoidal variables), one white dwarf/M dwarf pair candidate, and rotating variable stars. We also recover astrometry (< ±1–2 arcsec level accuracy) and photometry of two targeted near-Earth asteroids, 2006 DZ169 and 1996 SK, along with the small- (~0.12 mag) and relatively large-amplitude (~0.5 mag) variations of their dominant rotational signals in R-band.


Keywords: methods: data analysis, techniques: photometric, stars: variables: general, asteroids: general

1. INTRODUCTION

The Deep Ecliptic Patrol of the Southern Sky (DEEP-South: Moon et al. 2016) is a dedicated photometric study to physically characterize small bodies in our Solar System, as one of the secondary science projects of the Korea Microlensing Telescope Network (KMTNet; Kim et al. 2016a). The DEEP-South employs a network of three identical 1.6-m telescopes located in Chile (CTIO), South Africa (SAAO) and Australia (SSO), allowing 24-hour monitoring of asteroids and comets. Light curves with BVRI-band colors have been acquired for more than two hundred Near-Earth Asteroids (NEAs) since late 2015. The major efforts to discover NEAs have historically concentrated on planetary defense to ensure Earth’s safety from asteroid impacts. Several NEA discovery projects, such as the Catalina Sky Survey (CSS, Larson et al. 2003) and the Panoramic Survey Telescope and Rapid Response System (Pan-STARRS, Kaiser et al. 2002), have cataloged more than 90 percent of the estimated population of NEAs larger than 140 meters in regular scans of the sky. Compared with other NEA surveys, the advantage of the DEEP-South is its round-the-clock operation capability, which is essential for either precision astrometry or time-series photometry with high temporal resolution.

As in the cases of other asteroid surveys (e.g., LONEOS: Bowell et al. 1995; LINEAR: Stokes et al. 2000; CSS: Larson et al. 2003), long time-baseline observations have the potential to advance our knowledge of variable and transient phenomena over different timescales, from RR Lyraes (Miceli et al. 2008; Sesar et al. 2013; Torrealba et al. 2015) to periodic variables (Palaversa et al. 2013; Drake et al. 2014) to AGNs (Ruan et al. 2012). These time-domain data sets also increase our understanding of stellar structure and evolution: pulsating stars (e.g., δ Scuti, RR Lyrae, Cepheid, and Mira variables) are important probes of internal structure for investigating the different excitation mechanisms of stellar oscillations, eclipsing binary systems allow us to determine the masses of both stars in a direct way, and Type Ia supernovae are used as probes of cosmology. These outcomes emphasize the importance of accurate and homogeneous photometric measurements and calibrations across the entire survey data, with a view to further synergy with upcoming data releases from ongoing and future time-domain surveys.

Since the standard software packages of the DEEP-South are designed for differential photometry of targeted moving objects (Yim et al. 2016), we have implemented an updated version of the source detection and time-series photometry pipeline to recover point sources which are not extracted in the original pipeline. First, the new pipeline is designed to conduct robust high-precision photometry and calibration of non-crowded fields with a varying point spread function (PSF). The PSF varies across images and changes with time due to focus, pointing jitter, optical distortion or atmospheric conditions, particularly for a wide field-of-view (FoV). Due to the large FoV (~4 deg²) of the KMTNet mosaic CCD images, we expect to see deformation of the image PSF as well as spatial variations in the pixel scale¹. To mitigate this problem, we perform multi-aperture photometry to determine the optimal aperture for each point source at every epoch and to correct position-dependent variations in the PSF shape across the mosaic. In addition, the pipeline is designed to perform forced photometry, which makes it possible to extract consistent measurements of explicit sources in every frame with respect to the reference (deep co-added) frame. This approach can reduce the false detection rate at low flux levels.

Further, this paper addresses a multi-step calibration issue to tie all DEEP-South data coming from the three different telescopes to a consistent photometric system. As in the case of the LINEAR and CSS surveys, standard stars are not available in every survey field for precision photometry, especially because of varying observation conditions. Due to the observing strategy of the program, however, we can determine relative calibration parameters (e.g., relative overall offset) by using multiply observed stars in any overlapping fields. Since some of these stars have been measured in other all-sky multi-color surveys, we can transform our Johnson-Cousins filter system (only BVRI) to other photometric systems, properly taking into account color terms. Similar calibration approaches have been successfully used in other asteroid surveys (Drake et al. 2009; Sesar et al. 2011).

Lastly, we have implemented a database management system for handling very large source catalogs. The source database is a key part of a public data release enabling future scientific investigations. The data from most asteroid surveys are easily accessible in the form of an SQL-based relational database with an interactive web-based interface, containing all the epoch-based source photometry and metadata for tens to hundreds of millions of objects (e.g., LINEARdb: Sesar et al. 2011; Pan-STARRS1: Flewelling et al. 2016; ATLAS-VAR: Heinze et al. 2018). For convenience, all these systems provide internally linked tables by using a unique label for every object, obtained by grouping individual source detections at a given matching radius. This approach significantly increases the speed of query execution for already linked objects. However, it does not allow us to investigate possible transients, variables or moving objects within the database itself in a flexible way. In order to accelerate data access and reduce query response time, we adopt an efficient data indexing technique called FastBit (Wu et al. 2009) that stores data in a column-oriented manner, unlike traditional relational databases. In this paper, we focus on the development and applications of catalog-based searches for variability in stars with fixed coordinates using this database. In the second paper of this series, we will present an additional application example to search for and identify moving objects with motion vectors.

This paper is organized as follows. In Section 2 we describe the DEEP-South survey and experimental time-series data. In Section 3 we outline the new photometry pipeline, calibration issues and the FastBit database system. In Section 4 we discuss a few example applications exploring temporal and spatial variability, especially periodic variable stars and targeted asteroids. We conclude with a view to utilizing the pipeline for massive variability searches with the full set of the DEEP-South survey data.


2. THE DEEP-SOUTH PHOTOMETRIC CENSUS FOR ASTEROIDS AND COMETS

The DEEP-South observations were made with the three 1.6-m prime-focus telescopes during the off-season of the Galactic bulge monitoring campaign of the KMTNet main survey. Survey observations comprise 135 distinct full nights per year between 2015 and 2019. The KMTNet camera consists of four 9k by 9k CCDs having a FoV of 2×2 degrees with a pixel scale of 0.4 arcsec per pixel (Kim et al. 2016a). The CCD chips are arranged with vertical (north-south direction) and horizontal (east-west direction) gaps of about 373 and 184 arcsec, respectively. The cameras have a 30 second readout time for the full mosaic with a readout noise of 10 electrons. The DEEP-South survey uses four standard Johnson-Cousins BVRI filters, but mostly the R-band is used for time-series work. The quantum efficiency is 80–90% across the 4000–9000 Å range, with a peak at R-band (~90%). About 50–100 science and associated calibration frames are produced in every nightly run (total 65–260 GB of data products). All the raw images are transferred to the KMTNet data center located at Daejeon in Korea for pre-processing.

The survey is designed to address a number of scientific goals that use five different observation modes (see Table 1 in Moon et al. 2016 for details). Opposition Census (OC) is the most frequently used mode for targeted photometry of NEAs in the regions of the sky around opposition. At that location, their color or brightness can be measured accurately. Each OC run consists of a series of exposures of more than two target fields that are visible in the opposition region. These runs normally employ exposure times short enough to avoid reducing the detection sensitivity for a moving object which may appear slightly streaked. From these OC observations, we can expect to obtain physical properties (e.g., rotation period and color indices), and to discover unknown moving objects. A small fraction of the time is also devoted to both ecliptic survey and target-of-opportunity follow-up observations.

Table 1
Summary of R-band time-series observations in the N02007-OC field

Site   Date         FWHM^a (arcsec)   Airmass      # Mosaic pointings
CTIO   2015-08-14   2.35±0.28         1.23–1.37    13
       2015-08-16   1.37±0.17         1.22–1.31    15
       2015-08-18   1.56±0.25         1.22–1.48    33
       2015-08-22   3.52±0.44         1.22–1.55    35
       2015-08-24   1.30±0.24         1.22–1.61    28
       2015-11-01   1.38±0.16         1.22–1.47    26
       2015-11-02   1.92±0.30         1.22–1.53    26
       2015-11-03   1.47±0.24         1.22–1.62    25
       2015-11-05   1.54±0.20         1.22–1.59    23
SAAO   2015-08-08   2.05±0.42         1.26–1.49    26
       2015-08-10   1.28±0.44         1.26–1.49    32
       2015-08-12   2.72±0.21         1.28–1.38    9
       2015-08-14   2.68±0.33         1.26–1.39    23
       2015-08-24   2.91±0.38         1.26–1.78    31
       2015-11-09   2.79±0.49         1.35–1.58    12
SSO    2015-08-08   1.97±0.26         1.24–1.39    26
       2015-08-10   2.38±0.33         1.24–1.46    29
       2015-08-18   1.60±0.20         1.24–1.58    35
       2015-08-20   1.40±0.26         1.24–1.61    32
       2015-08-22   2.23±0.22         1.24–1.36    17
       2015-11-07   3.02±0.52         1.31–1.62    14
Total                                              510
a Median over images on that date.

2.1. Experimental Data Sets

For the purpose of developing a new photometric pipeline, we use the DEEP-South first-year data collected from late-July 2015 through mid-April 2016. Here we present results for the most frequently observed field, N02007-OC, centered at RA=00:24:00, DEC=+05:00:00 (Figure 1). The first letter in the field designation indicates a sign of Ecliptic latitude: N and S for + and -, respectively. The next two digits are Ecliptic latitude in degrees; the last three digits are Ecliptic longitude in degrees. This field has full overlap with the Sloan Digital Sky Survey (SDSS) which provides colors of objects detected in the ugriz system. The observation logs are summarized in Table 1. We conducted time-series observations of the N02007-OC field only in the R-filter to measure periodic variability induced by rotation of the two targeted asteroids (see Section 4.3). These moving objects were detected in 200-second exposures with sufficient signal-to-noise ratio down to R ~ 24.
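The field naming rule described above can be sketched in a few lines of Python. This is only an illustration: the function name `parse_field_name` is hypothetical, and treating the suffix after the hyphen (here `OC`) as the observation mode is an assumption based on the Opposition Census mode named in Section 2.

```python
import re

def parse_field_name(name):
    """Split a field name such as 'N02007-OC' into its parts.

    Returns (ecliptic_latitude_deg, ecliptic_longitude_deg, suffix).
    The suffix is assumed (not confirmed by the text) to be the observing mode.
    """
    match = re.fullmatch(r"([NS])(\d{2})(\d{3})-(\w+)", name)
    if match is None:
        raise ValueError(f"unrecognised field name: {name!r}")
    hemisphere, lat, lon, suffix = match.groups()
    sign = 1 if hemisphere == "N" else -1   # N: positive latitude, S: negative
    return sign * int(lat), int(lon), suffix

print(parse_field_name("N02007-OC"))  # (2, 7, 'OC')
```

For N02007-OC this gives an ecliptic latitude of +2 degrees and an ecliptic longitude of 7 degrees.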


Figure 1. 
Number of epochs per target field in the DEEP-South year-one data obtained from the three KMTNet telescopes. The values are color-coded by the bar on the right (from 1 to 510 visits). The solid line shows the Galactic plane in equatorial coordinates using the Aitoff projection. The location of the selected field, N02007-OC, is recognized by its darkest color in the center.

Since early science commissioning observations were undertaken, there have been many updates and changes in CCD readout electronics and software for telescope operation. As part of quality assurance of the year-one data, we check problematic images (e.g., poor telescope tracking, readout error, or bad seeing) separately for each of the four CCD chips in the mosaic, i.e., on a chip-by-chip basis. We define the individual amplifier images as the basic unit of our photometry pipeline, called a frame. After excluding 24 bad frames, identified both automatically and manually through the pipeline (see Section 3), 16296 frames were used for this study.
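The frame count quoted above follows from the numbers given elsewhere in the text: 510 mosaic pointings (Table 1), four CCD chips per mosaic, and 32 amplifier frames per mosaic (Section 3), i.e. eight per chip. A short check of that arithmetic, with variable names chosen here for illustration:

```python
pointings = 510        # total mosaic pointings (Table 1)
ccds_per_mosaic = 4    # four CCD chips per mosaic
amps_per_ccd = 8       # 32 amplifier frames per mosaic (Section 3), 8 per chip
bad_frames = 24        # frames rejected during quality assurance

total_frames = pointings * ccds_per_mosaic * amps_per_ccd
used_frames = total_frames - bad_frames
print(total_frames, used_frames)  # 16320 16296
```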


3. DATA PROCESSING PIPELINE

The raw mosaic images are pre-processed by the KMTNet data reduction pipeline, including overscan, bias and flat-fielding reduction. The KMTNet camera produces strong crosstalk signals in multi-segment amplifiers for bright or saturated sources. The level of these electronic ghosts varies from frame to frame, but these can be removed by measuring crosstalk coefficients among frames within a single CCD image (see Kim et al. 2016b for details).

The components of our processing pipeline are separated into three basic units based on the format of data. Here we briefly summarize main procedures applied to the pre-processed data.

  • 1. MEF unit (18432×18464 pixels): we obtain an accurate astrometric solution using SCAMP (Bertin 2006) after correcting for geometric field distortion that is well described by a polynomial with third-order terms measured from the center of the mosaic. This distortion pattern appears to be stable over time at the three different KMTNet sites. The astrometric solution is determined as follows: (i) we run SCAMP separately for each exposure, and then (ii) we run it again to find a final solution that is consistent across all input frames. This astrometric solution is verified against the 2MASS reference system (Skrutskie et al. 2006). The resultant internal and external astrometric accuracies on overlapping sources are less than 0.02±0.01 and 0.09±0.06 arcsec, respectively, in both RA and Dec directions. In the future, we will use the second data release of Gaia as input to validate the astrometric analysis.
  • 2. CCD unit (9216×9232 pixels): we remove cosmic-ray hits or hot-pixels using the parallelized version of the L.A.Cosmic algorithm (van Dokkum 2001)².
  • 3. AMP unit (1152×9232 pixels): we adopt the strategy of treating all 32 frames separately in the photometry and calibration processes in order to reduce processing time.
3.1. Frame Catalog Generation

Source extraction is the first step in making a complete frame catalog for all point sources either in fixed or varying positions. We use the source detection algorithm implemented in SExtractor (Bertin & Arnouts 1996), in which each source is regarded as a set of connected pixels that exceed a threshold above the local background. In this work, all connected regions with more than 5 pixels above 2.5-σ of the local background were extracted. We also chose the following SExtractor parameters to optimize point source detection as well as to properly take into account blended sources: DETECT_MINAREA=5, DETECT_THRESH=2.5, DEBLEND_NTHRESH=64, and DEBLEND_MINCONT=0.0001. The average FWHM of each frame was used as a prior input for the SEEING_FWHM parameter. This process also produces several measurement characteristics such as FWHM, elongation, extraction flags, and star/galaxy classifier.
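The idea of a source as a set of connected pixels above a threshold can be illustrated with a small pure-Python flood fill. This is a minimal sketch and not SExtractor itself: it assumes a constant background and noise level passed in by the caller, uses 8-connectivity, keeps groups with at least `min_area` pixels, and omits filtering and deblending entirely. The function name `detect_sources` is hypothetical.

```python
def detect_sources(image, background, sigma, thresh=2.5, min_area=5):
    """Return one list of (row, col) pixels per connected group above threshold."""
    limit = background + thresh * sigma
    n_rows, n_cols = len(image), len(image[0])
    seen = [[False] * n_cols for _ in range(n_rows)]
    sources = []
    for r0 in range(n_rows):
        for c0 in range(n_cols):
            if seen[r0][c0] or image[r0][c0] <= limit:
                continue
            # Flood fill over the 8 neighbours of each pixel above the limit.
            stack, pixels = [(r0, c0)], []
            seen[r0][c0] = True
            while stack:
                r, c = stack.pop()
                pixels.append((r, c))
                for dr in (-1, 0, 1):
                    for dc in (-1, 0, 1):
                        rr, cc = r + dr, c + dc
                        if (0 <= rr < n_rows and 0 <= cc < n_cols
                                and not seen[rr][cc] and image[rr][cc] > limit):
                            seen[rr][cc] = True
                            stack.append((rr, cc))
            # Small groups (e.g. single hot pixels) are discarded.
            if len(pixels) >= min_area:
                sources.append(pixels)
    return sources

# Toy 7x7 image: a 3x3 blob plus one isolated hot pixel on a flat background.
image = [[100.0] * 7 for _ in range(7)]
for r in range(1, 4):
    for c in range(1, 4):
        image[r][c] = 120.0
image[5][5] = 150.0
sources = detect_sources(image, background=100.0, sigma=2.0)
print(len(sources), len(sources[0]))  # 1 9
```

In the toy image the nine-pixel blob is kept and the isolated hot pixel is rejected by the minimum-area cut.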

We utilize the multi-aperture photometry algorithm (Chang et al. 2015) in order to: (i) determine the optimal aperture for each object at each epoch, (ii) use an empirical aperture-growth-curve method for flux correction, and (iii) add a new tag that isolates peculiar situations where photometry returns improper measurements. For each frame, we measure photometry in a series of circular apertures (up to 12-pixel radius) without changing the sky annulus, and we use the last aperture (corresponding to a 12-pixel radius) as the reference for aperture corrections. We refer the reader to Chang et al. (2015) for details of the photometric performance of the pipeline. The position-dependent PSF variation and pixel-scale variation across fields for a given focus result in up to 10% changes in magnitude, depending on the selected aperture size for photometry. Figure 2 shows an example of position- and aperture-dependent magnitude offsets; note that the coverage of the x-axis (~0.13 deg) is about eight times smaller than that of the y-axis (~1.02 deg). This may be expected to artificially enlarge the apparent size of the magnitude offset along the y-axis. The magnitude offsets vary significantly for photometry performed with small apertures. The magnitude differences between the reference aperture and the smaller apertures are well described by two-dimensional polynomial forms, so we can account for any spatial variation in the aperture corrections across a single frame.


Figure 2. 
Position- and aperture-dependent magnitude offsets (gray points) and the same offsets after correction by a two-dimensional surface fit (orange points) for an example frame. The dispersion of the magnitude offsets is much larger at smaller aperture sizes (top) than at larger ones (bottom). The color maps show magnitude offsets in the two-dimensional image frame.

Multiple visits to the same region of the sky enable us to achieve stable and uniform internal relative calibration in our instrumental photometric system. Bright, non-variable sources can be used as internal calibration stars, but it is necessary to check whether the photometric quality of the calibrators is sufficient and whether they are distributed homogeneously across the whole (x, y) plane in a given frame. For the latter, we calculate an additional quality-ensuring cut characterizing the two-dimensional distribution of these internal calibrators by using a two-dimensional Kolmogorov-Smirnov (K-S) test (e.g., Peacock 1983). The significance levels of the statistic for the 2D K-S test can be summarized by the simple formula for the one-sample case (see also Equation 14.7.1 in Press et al. 1992). Using this probability PKS, we can figure out whether the source distribution is close to uniform across the frame. Figure 3 shows the spatial distributions of selected calibration stars for good and bad cases, in which a relatively large value of the probability (PKS > 0.15) indicates the presence of inhomogeneity.


Figure 3. 
Example of the spatial distribution of internal calibrators for good (gray points) and bad (red points) cases together with the calculated probability PKS. In the top panel, each frame location is indicated by frame numbers.

Following the suggestion of Ivezić et al. (2007), we also correct the spatial dependence of internal zeropoints around the photometric zeropoint separately for the 32 amplifiers in order to take into account atmospheric extinction gradients. We fit for linear gradients in both x and y coordinates simultaneously that can be expressed as:

$$\Delta m = f(x, y) = C_0 + C_1 x + C_2 y \qquad (1)$$

where ∆m is the difference between reference and measured magnitudes for each star on every frame. The magnitude error is used as weighting factor for regression. C0 is a good diagnostic for discerning whether unrecognized temporal changes in atmospheric transparency have occurred during the observations. C0 should be close to zero when the quality of data is adequate. Figure 4 shows the histogram of C0 values for all frames along with cut-out images having three different quality cuts (good, intermediate, and bad). Finally, we apply the resulting zero-point surface to the instrumental magnitudes. A simple Boolean index is added to the database to be used as a photometric quality flag of our internal calibrations (see CALDEX in Appendix).
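A weighted fit of the linear zero-point surface in Equation (1) can be sketched without external libraries by solving the 3×3 normal equations directly. This is an illustration under stated assumptions, not the pipeline code: the function name `fit_zeropoint_plane` is hypothetical, and the use of inverse-variance weights (1/error²) is an assumption, since the text only says that the magnitude error is used as the weighting factor.

```python
def fit_zeropoint_plane(x, y, dm, err):
    """Weighted least-squares fit of dm = C0 + C1*x + C2*y; returns (C0, C1, C2)."""
    # Build the normal equations A c = b with weights 1/err^2 (assumed form).
    A = [[0.0] * 3 for _ in range(3)]
    b = [0.0] * 3
    for xi, yi, di, ei in zip(x, y, dm, err):
        w = 1.0 / (ei * ei)
        basis = (1.0, xi, yi)
        for j in range(3):
            b[j] += w * basis[j] * di
            for k in range(3):
                A[j][k] += w * basis[j] * basis[k]
    # Solve by Gaussian elimination with partial pivoting.
    M = [A[j] + [b[j]] for j in range(3)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, 3):
            f = M[r][col] / M[col][col]
            for k in range(col, 4):
                M[r][k] -= f * M[col][k]
    # Back substitution.
    c = [0.0] * 3
    for j in (2, 1, 0):
        c[j] = (M[j][3] - sum(M[j][k] * c[k] for k in range(j + 1, 3))) / M[j][j]
    return tuple(c)

# Synthetic calibrators on one amplifier frame (pixel coordinates, made-up values).
xs = [100.0, 500.0, 900.0, 300.0, 700.0]
ys = [1000.0, 8000.0, 4000.0, 6000.0, 2000.0]
dms = [0.02 + 1e-5 * xi - 2e-5 * yi for xi, yi in zip(xs, ys)]
c0, c1, c2 = fit_zeropoint_plane(xs, ys, dms, [0.01] * 5)
print(c0, c1, c2)
```

The fitted surface would then be subtracted from the instrumental magnitudes, and a value of C0 far from zero would mark the frame as suspect, as described above.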


Figure 4. 
Histogram of C0 coefficients for all observed frames. The arrow indicates the range of photometric data quality observed under either perfectly photometric (C0 ∼ 0) or partially non-photometric conditions (-2 < C0 < -1). All inset images are presented as examples of different quality cuts.

3.2. Photometric Recalibration

To overcome the absence of observations used for photometric standardization, as mentioned in the Introduction, we recalibrate our photometry to the SDSS Data Release 13 (SDSS DR13; Albareti et al. 2017) photometric catalog. The main reason is that there are no suitable photometric catalogs from other projects (e.g., no overlap with the first public data release of the Dark Energy Survey, and poor photometric quality in the third data release of the Palomar Transient Factory). In addition, the well-studied SDSS color system is better suited than other photometric systems (e.g., Pan-STARRS grizy filters or the Gaia broad bandpass) to aid in the study of point sources identified by our survey. Because only part of the DEEP-South field is covered by the SDSS DR13 photometric catalog, this approach is limited mainly by the declination limit. The recent SkyMapper Southern Survey provides the most comprehensive publicly released map of the Southern sky, containing well-calibrated magnitudes of point sources up to 18 (AB mag) in all uvgriz bands (Wolf et al. 2018), and thus it can serve as a calibration reference for the full data sets in the future.

From the SDSS DR13 database, we select suitable calibration stars with a series of clean flags (e.g., no edge, non-saturated, no cosmic-ray hits, primary object or no neighborhoods), magnitude errors below 0.05 in ri bands, and PSF magnitudes in the ranges 14≤ r ≤20 and 14≤ i ≤20. For the DEEP-South data, the mean magnitude error of the sources used becomes less than 0.05 mag in a given magnitude range. The transformation equation between the Johnson R and SDSS r system is simply derived as

$$R = r + k\,(r - i) + C \qquad (2)$$

where k is the first-order color term and the possible zeropoint shift, C, is determined by iteratively rejecting outliers in the residual. The median absolute deviation of the residual of the fit for -0.3 < r − i < 2.1 is 0.029. The color term k equals -0.2837, which is similar to that found previously (k=-0.2936); please refer to the transformation equations derived by Lupton (2005)³. Figure 5 shows the transformation between the SDSS r, i and DEEP-South R magnitudes as defined by Equation (2). The scatter of the magnitude difference between these two photometric systems is roughly constant with respect to magnitude. We see that the impact of a few photometric outliers is negligible at these color and magnitude levels.
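A fit of Equation (2) with iterative outlier rejection might look like the sketch below. The function names, the clipping threshold of three scaled median absolute deviations, and the number of iterations are assumptions made for illustration; the text does not state which rejection rule was used. In the synthetic example only k = -0.2837 comes from the text, while the zero-point C = 0.1 is a made-up value.

```python
from statistics import median

def fit_line(points):
    """Ordinary least-squares slope and intercept for (x, y) pairs."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points)
    k = sxy / sxx
    return k, my - k * mx

def fit_color_term(r, i, R, n_iter=5, clip=3.0):
    """Fit R - r = k (r - i) + C, rejecting outliers iteratively; returns (k, C)."""
    points = [(rr - ii, RR - rr) for rr, ii, RR in zip(r, i, R)]
    for _ in range(n_iter):
        k, C = fit_line(points)
        resid = [y - (k * x + C) for x, y in points]
        med = median(resid)
        mad = median(abs(v - med) for v in resid)
        if mad == 0.0:
            break
        # 1.4826 * MAD approximates one standard deviation for Gaussian noise.
        keep = [p for p, v in zip(points, resid)
                if abs(v - med) <= clip * 1.4826 * mad]
        if len(keep) == len(points):
            break
        points = keep
    return fit_line(points)

# Synthetic stars spanning -0.3 < r - i < 2.1, with one gross outlier.
true_k, true_C = -0.2837, 0.1          # C is a hypothetical value
colors = [-0.3 + 0.1 * j for j in range(25)]
r_mag = [15.0 + (j % 5) for j in range(25)]
i_mag = [rm - col for rm, col in zip(r_mag, colors)]
noise = [0.01 * (((j * 7) % 5) - 2) / 2 for j in range(25)]
R_mag = [rm + true_k * col + true_C + nz
         for rm, col, nz in zip(r_mag, colors, noise)]
R_mag[12] += 1.0                        # simulated photometric outlier
k_fit, C_fit = fit_color_term(r_mag, i_mag, R_mag)
print(round(k_fit, 3), round(C_fit, 3))
```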


Figure 5. 
Transformation between the DEEP-South R and SDSS r magnitudes as a function of r − i. The light points are stars with low photometric error (less than 0.1 magnitudes) in griz bands. Top: the solid line shows the best fit with Equation (2); Bottom: corresponding difference between transformed DEEP-South and SDSS magnitudes as a function of the SDSS r magnitude.

As a final step, we ingest all frame catalogs into the FastBit database (see Appendix for technical details) that provides a set of compressed bitmap indexes to quickly retrieve the list of objects in a given sky region, observational parameters of selected sources or time-series data (e.g., light curves).


4. SEARCHING FOR TEMPORAL AND SPATIAL VARIABILITY
4.1. Light Curve Production

Performing a cone search is the easiest way to construct light curves for stationary sources detected at least twice, called groups. We first build a master catalog of about 46,000 groups merged by matching positions (with a match radius of 0.5 arcsec) across different epochs and amplifiers, and we also generate a catalog of transient sources detected only once (see Section 4.3). Our smaller matching radius can reduce the number of spurious matches, but it causes us to lose a few real matches for faint sources. We find that about 80 new groups (mostly R > 20) can be associated by positional matching if we double the cutoff radius, thus the group sample considered here may not severely limit our variability results in the next section. Within each group, we determine mean positions by combining all astrometric measurements after excluding flagged data points. Now we only require the mean sky position and a selected search radius to produce a light curve of each group. We choose a loose cut for light curve production (i.e., matching radius of 0.8 arcsec). The following example command executes a query on the database and returns associated measurements:

ibis -d DeepSouth-DB -q "SELECT MJD, MAG_MAP, MAGERR_MAP WHERE (SQRT(POWER(XWIN_RA-RA_group, 2) + POWER(YWIN_DEC-DEC_group, 2)) BETWEEN 0 AND SEARCH_RADIUS) AND FLAGS < 4 ORDER BY MJD"
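The same selection can be sketched in Python on rows that have already been read into memory. The column names are those of the query above; representing rows as dictionaries and the function names are assumptions for illustration. Unlike the query as printed, this sketch scales the RA difference by cos(Dec), which is a small correction near Dec = +5 degrees, and it does not handle the RA wrap at 0/360 degrees.

```python
import math

def angular_separation_arcsec(ra1, dec1, ra2, dec2):
    """Small-angle separation in arcsec between two positions given in degrees."""
    d_ra = (ra1 - ra2) * math.cos(math.radians(0.5 * (dec1 + dec2)))
    d_dec = dec1 - dec2
    return 3600.0 * math.hypot(d_ra, d_dec)

def cone_search(rows, ra0, dec0, radius_arcsec=0.8, max_flags=4):
    """Return (MJD, mag, mag_err) tuples near (ra0, dec0), sorted by MJD."""
    hits = [
        (row["MJD"], row["MAG_MAP"], row["MAGERR_MAP"])
        for row in rows
        if row["FLAGS"] < max_flags
        and angular_separation_arcsec(row["XWIN_RA"], row["YWIN_DEC"],
                                      ra0, dec0) <= radius_arcsec
    ]
    return sorted(hits)

# Made-up measurements around a group at RA = 6.0 deg, Dec = +5.0 deg.
arcsec = 1.0 / 3600.0
rows = [
    {"MJD": 57250.2, "MAG_MAP": 16.21, "MAGERR_MAP": 0.01,
     "XWIN_RA": 6.0, "YWIN_DEC": 5.0 + 0.3 * arcsec, "FLAGS": 0},
    {"MJD": 57248.1, "MAG_MAP": 16.25, "MAGERR_MAP": 0.01,
     "XWIN_RA": 6.0, "YWIN_DEC": 5.0 - 0.2 * arcsec, "FLAGS": 0},
    {"MJD": 57251.3, "MAG_MAP": 18.90, "MAGERR_MAP": 0.05,
     "XWIN_RA": 6.0, "YWIN_DEC": 5.0 + 2.0 * arcsec, "FLAGS": 0},   # too far
    {"MJD": 57252.4, "MAG_MAP": 16.40, "MAGERR_MAP": 0.02,
     "XWIN_RA": 6.0, "YWIN_DEC": 5.0 + 0.1 * arcsec, "FLAGS": 4},   # flagged
]
light_curve = cone_search(rows, 6.0, 5.0)
print(light_curve)
```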

In this paper, we limit the sample to ~13,000 stars which have reliable color information in the SDSS DR13 database. Note that there are overlapping regions with the Pan-STARRS1 data, but it is similar in depth to the SDSS in r-band for point sources.

In order to remove systematic noise caused by atmospheric variations or instrumental effects, we apply the photometric detrending algorithm (see Kim et al. 2009 for details)⁴ to the light curves. This algorithm finds position- and time-dependent systematics using a clustering technique, and then corrects the highly affected light curves by removing the systematics from individual light curves. Figure 6 shows the overall dispersion of light curves before and after the detrending procedure without any outlier clipping. Here, we use the root-mean-square (RMS) amplitude of light curves as a proxy of our photometric precision. The RMS values of DEEP-South light curves decrease to 0.3% precision level at the bright end (R ≤ 16). Applying this technique to the raw light curves gives good results in recovering the true brightness variations over the full magnitude range from 14 to 22.5.


Figure 6. 
Robust RMS of light curves before (left) and after (right) the photometric detrending as a function of magnitude. The solid lines are smoothed averages using a bin width of 0.5 magnitude. The horizontal lines indicate a photometric precision of 1% and 0.3%, respectively.

Figure 7 shows the detrended, final light curves of known variable stars from the AAVSO International Variable Star Index (VSX: Watson et al. 2017). We identify three out of five known variable stars in the N02007-OC field; two of them are W Ursae Majoris-type eclipsing variables (EW) with periods shorter than 0.5 days (CRTS J002358.0+052711 and CRTS J002328.2+040635). The other one is a typical ab-type RR Lyrae star, [MRS2008] 006.577957+03.925690, with heliocentric distance ~15.84 kpc. The photometric quality of folded light curves is good enough to conduct variability analysis. One star shown in Figure 7, CRTS J002358.0+052711, is close to the saturation limit (≈14–15 mag depending on seeing conditions) in the N02007-OC field, causing the relatively large scatter of the light curve over all phase bins. Two of the unmatched variables (ASAS J002137+0513.8 and CD Psc) are bright (<12 mag) and saturated in our observations.


Figure 7. 
Final light curves of three known variable stars folded by their period. Both variable types and periods are noted above each phased light curve.

4.2. Periodogram Analysis for Variability Search

One explicit advantage of longitudinal network observations with the KMTNet facility is to alleviate aliasing signals due to the common occurrence of data gaps in unevenly spaced time-series data. Figure 8 compares the results of period analysis by various periodogram tools (e.g., VARTOOLS: Hartman & Bakos 2016 and MS Period: Shin & Byun 2004) for the CTIO sample only and for the combined data from three telescopes. The spectral peaks in the periodogram are severely affected by daily aliasing signals for the CTIO sample, but combining the time series data from the three sites greatly reduces the false peaks due to aliasing.


Figure 8. 
Period finding results by various periodogram tools for the CTIO sample only (left) and for the combined sample from three telescopes (right). Each periodogram power is normalized by the peak amplitude for comparison purpose. The arrows indicate a true period of 0.25288 days for one known EW variable star shown in Figure 7.

We mainly use two different algorithms implemented in the VARTOOLS light curve analysis program to search for periodic variables with periods less than 40 days at an initial frequency resolution of 0.1/T (T is the time-span of the light curve): a Generalized Lomb-Scargle (LS) algorithm and an Analysis of Variance (AoV) algorithm using either phase binning or multi-harmonic model fitting (see Hartman & Bakos 2016 and references therein). To reduce erroneous results caused by outliers, we apply a typical sigma clipping (4-sigma) to each light curve before searching for semi-sinusoidal signals. Each algorithm gives a diagnostic parameter to test the significance of candidate periods. We choose different conservative selection cuts based on a false alarm probability (FAP) by visual inspection of example light curves at a range of values: e.g., LS FAP < -60, AoV FAP < -100, and AoVharm FAP < -140. Moreover, we remove the most obvious spurious detections having a value of peak frequency very close to 1 or 2 c/d (mostly around ~0.99 and ~2.02 days) due to the daily gaps even in the combined sample (see Table 1 for observation dates). Figure 9 shows the histograms of formal FAP values and initial periods from the LS period-search algorithm as an example after removing aliases. With these criteria, we check all the phase-folded light curves by eye and refine the initial estimate of the periods using fine grid searches (<0.01/T) around the highest peaks. Table 3 lists 21 new periodic variable stars in the period range of 0.1–31 days. For most stars, the difference, ∆P, in the periods computed by using the LS and AoV algorithms is less than 0.5%. Table 2 summarizes the steps we used to identify a sample of periodic variable stars from our database for the N02007-OC field.
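The period search on a frequency grid with spacing 0.1/T can be illustrated with a compact classical Lomb-Scargle periodogram. This is a sketch only: it implements the classical Scargle form rather than the generalized variant in VARTOOLS, it computes no false alarm probability, and the function names and the lower period limit of 0.1 days are assumptions. The synthetic light curve is a pure sinusoid sampled at random times, using the 0.25288-day period quoted in the Figure 8 caption as the input value.

```python
import math
import random

def lomb_scargle(t, y, freqs):
    """Classical (not generalized) Lomb-Scargle power at each frequency in c/d."""
    mean = sum(y) / len(y)
    dy = [v - mean for v in y]
    var = sum(v * v for v in dy) / (len(y) - 1)
    power = []
    for f in freqs:
        w = 2.0 * math.pi * f
        # Time offset tau makes the sine and cosine terms orthogonal.
        tau = math.atan2(sum(math.sin(2 * w * ti) for ti in t),
                         sum(math.cos(2 * w * ti) for ti in t)) / (2 * w)
        c = [math.cos(w * (ti - tau)) for ti in t]
        s = [math.sin(w * (ti - tau)) for ti in t]
        yc = sum(a * b for a, b in zip(dy, c))
        ys = sum(a * b for a, b in zip(dy, s))
        cc = sum(a * a for a in c)
        ss = sum(a * a for a in s)
        power.append((yc * yc / cc + ys * ys / ss) / (2.0 * var))
    return power

def frequency_grid(t, p_min=0.1, p_max=40.0, resolution=0.1):
    """Frequencies from 1/p_max to 1/p_min in steps of resolution / T."""
    span = max(t) - min(t)
    step = resolution / span
    n = int((1.0 / p_min - 1.0 / p_max) / step) + 1
    return [1.0 / p_max + k * step for k in range(n)]

random.seed(1)
true_period = 0.25288                     # days
t = sorted(random.uniform(0.0, 20.0) for _ in range(150))
y = [0.2 * math.sin(2 * math.pi * ti / true_period) + random.gauss(0.0, 0.02)
     for ti in t]
freqs = frequency_grid(t)
power = lomb_scargle(t, y, freqs)
best_period = 1.0 / freqs[power.index(max(power))]
print(f"recovered period: {best_period:.5f} d")
```

A finer grid around the highest peak, as described above, would then refine this initial estimate. Real eclipsing-binary light curves with two similar minima per orbit often peak at half the orbital period in such a periodogram, which is one reason the phase-folded curves are inspected by eye.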

Table 2
Periodic variable selection criteria

Selection criterion                    Number of objects
All 2.5-σ sources                      15,629,598^a
All possible groups (<0.5 arcsec)      46,054
SDSS stars with gri colors             13,261
Variable candidates by FAP cuts        309
Rejection of aliased signals           57
Final periodic variables               24
a Number of individual photometric measurements.

Table 3 
21 New Periodic Variable Stars in the N02007-OC field
VarID  StarID (J2000 coordinates)  R (mag)  P (days)  A_R^a (mag)  u (mag)  g (mag)  r (mag)  i (mag)  z (mag)  Note
V1 SDSS J002013.16+052242.2 16.24 2.1038 0.06 19.66 17.35 16.21 15.71 15.35
V2 SDSS J002031.74+055112.6 16.33 1.1568 0.04 20.26 17.72 16.38 15.78 15.43
V3 SDSS J002051.80+042641.3 17.50 0.3076 0.13 19.59 18.10 17.48 17.25 17.15 ELV
V4 SDSS J002111.05+054856.4 18.60 1.5046 0.04 22.12 20.04 18.61 17.23 16.46
V5 SDSS J002125.21+040357.0 16.79 31.5771 0.03 21.12 18.20 16.80 16.05 15.62
V6 SDSS J002141.37+051626.1 16.76 22.4097 0.01 20.83 18.14 16.71 15.89 15.47
V7 SDSS J002201.40+053612.5 18.35 0.3808 0.44 21.18 19.28 18.41 17.93 17.60 EB
V8 SDSS J002215.91+055338.9 20.19 0.2489 0.32 21.56 20.68 20.26 20.19 20.09 OC
V9 SDSS J002216.31+040439.5 19.59 23.4399 0.12 23.36 21.13 19.69 18.01 17.05
V10 SDSS J002247.39+045147.8 18.11 2.9532 >0.49 19.19 18.04 18.09 18.17 18.27 EB
V11 SDSS J002249.03+055356.6 18.33 16.0571 0.04 22.39 19.82 18.34 17.33 16.79
V12 SDSS J002314.27+042543.6 19.00 0.2673 0.16 23.05 20.33 18.97 17.58 16.75
V13 SDSS J002351.64+054553.7 15.89 21.7415 0.02 19.84 17.10 15.88 15.41 15.12
V14 SDSS J002425.52+044130.3 17.95 1.6484 0.10 20.22 18.64 17.89 17.54 17.38
V15 SDSS J002435.70+051914.1 16.64 7.5176 0.03 19.04 17.30 16.59 16.30 16.13
V16 SDSS J002516.12+053728.4 15.09 11.9412 0.02 18.82 16.14 15.07 14.68 14.46
V17 SDSS J002549.95+041608.6 20.60 0.1234 0.10 23.56 21.98 20.54 19.66 19.13
V18 SDSS J002610.07+043147.2 18.15 2.8856 0.02 22.81 19.64 18.12 16.73 15.95
V19 SDSS J002701.84+040710.5 17.69 0.3570 0.03 18.46 18.02 17.66 16.78 16.15 WDMD
V20 SDSS J002704.93+040519.4 15.32 20.4821 0.02 18.45 16.20 15.38 15.08 14.89
V21 SDSS J002733.09+043701.9 17.70 3.1778 0.04 21.41 19.15 17.66 16.57 15.99
a Peak-to-trough R-band variability amplitude of the sinusoidal function fitted to the phase-folded light curve.
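
As a rough illustration of how an amplitude like the one in the footnote to Table 3 can be obtained, the sketch below phase-folds a light curve at a given period and fits a single sinusoid by linear least squares. It is a minimal example under stated assumptions (a single-harmonic model, a known period, and hypothetical function names), not the procedure used to produce the table.

```python
import math


def phase_fold(times, period, epoch=0.0):
    """Map observation times to phases in [0, 1) for a trial period."""
    return [((t - epoch) / period) % 1.0 for t in times]


def solve_linear(matrix, rhs):
    """Solve a small dense linear system by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[r][k] -= factor * aug[col][k]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(aug[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (aug[i][n] - s) / aug[i][i]
    return x


def sinusoid_amplitude(times, mags, period):
    """Fit m = a + b*sin(2*pi*phase) + c*cos(2*pi*phase); return peak-to-trough amplitude."""
    phases = phase_fold(times, period)
    rows = [(1.0, math.sin(2 * math.pi * p), math.cos(2 * math.pi * p)) for p in phases]
    # Normal equations (A^T A) x = A^T y for the three linear coefficients.
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    aty = [sum(r[i] * m for r, m in zip(rows, mags)) for i in range(3)]
    _, b, c = solve_linear(ata, aty)
    return 2.0 * math.hypot(b, c)


# Synthetic, noise-free light curve with a semi-amplitude of 0.065 mag.
period = 0.3076
times = [0.01 * k for k in range(200)]
mags = [17.5 + 0.065 * math.sin(2 * math.pi * t / period) for t in times]
amp = sinusoid_amplitude(times, mags, period)
```

Because the model is linear in its three coefficients once the period is fixed, no iterative optimizer is needed; eclipsing systems are of course poorly described by a single sinusoid, which is consistent with the lower limit quoted for V10.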


Figure 9. 
Histograms of the FAP values (top) and initial periods (bottom) from the LS algorithm after filtering alias signals. Our conservative selection criterion, LS FAP < -60, is indicated by the dashed line. Example phased light curves (magnitude versus phase) are also shown for guidance.

Figure 10 shows the locations of the newly discovered variable stars in a color-color diagram, together with all other stars with magnitude errors less than 0.2 in the gri bands. The effect of SDSS color errors induced by variability is negligible for our variable stars (error bars are smaller than the symbols). Most of them lie close to the stellar main-sequence locus described by the SDSS gri colors (Covey et al. 2007; see the dashed line in the same figure). We have only limited information to characterize the nature of their variability because their spectra are not yet available. However, the light curve properties and source colors suggest that the variations are mainly due to spot-induced rotational modulation. One color outlier that lies far outside the locus is a candidate white dwarf/M dwarf pair (WDMD: SDSS J002701.84+040710.5; V19), which is likely a short-period binary showing reflection from a close companion at the orbital period. The observed color is also consistent with that expected from genuine WDMD binaries (e.g., see Figure 2 of Rebassa-Mansergas et al. 2013). As shown in Figure 10, two of the new variables show eclipsing binary (EB) signatures in their phase-folded light curves. The epochs of the primary and secondary eclipses are clearly seen over more than one complete cycle. SDSS J002201.40+053612.5 (V7) is identified as a semi-detached EB with a 0.3808 d orbital period. SDSS J002247.39+045147.8 (V10) is an Algol-type EB with an orbital period of 2.953 days and a detached configuration. We also show phase-folded light curves of all other discovered variables in Figure 11. SDSS J002215.91+055338.9 (V8) is an over-contact (OC) binary, recognized by the continuous variation between eclipses that is typical of W UMa-type systems. Lastly, SDSS J002051.80+042641.3 (V3) may be an ellipsoidal variable (ELV) showing double maxima and minima per orbital period due to tidal distortion.


Figure 10. 
Locations of the 21 new periodic variable stars (black dots) and all other stars (gray dots) in the color-color diagram. The dashed line indicates the main-sequence stellar locus with solar metallicity (Covey et al. 2007). The locations of the two eclipsing binaries and the one WDMD candidate are indicated by circles and a square with overplotted black dots, respectively. The bottom panels show their phase-folded light curves.


Figure 11. 
Phase-folded light curves of the other 18 periodic variable stars. The small error bars indicate the interquartile range of the measurement errors for each light curve.

4.3. Recovery of Moving Objects

Another application is to retrieve the trajectories and light curves of targeted asteroids from the catalog of transient sources, i.e., those detected only once. It is straightforward to recover the orbital paths of known asteroids on every frame by comparing with an ephemeris from the Minor Planet Center (MPC), the official organization responsible for the identification, designation and orbit computation of all types of moving objects (minor planets, comets and outer irregular natural satellites of the major planets). The two objects we observed in this field are 2006 DZ169 and 1996 SK. The former was considered a potential target for space missions because of its low rendezvous velocity (e.g., Mueller et al. 2011), while the latter is a potentially hazardous NEA (e.g., Lin et al. 2014). Their physical and dynamical properties are summarized in the NEAs database, which is updated continuously by the European Asteroid Research Node. Their measured rotation periods are unambiguous, with a reliability code of 3 (i.e., a secure result with full lightcurve coverage) based on the definition of Lagerkvist et al. (1989). Therefore, our observations allow us to confirm the results of the variability analyses reported in previous works.

We compute the celestial coordinates of these NEAs at a given epoch based on the orbital data provided by the MPC database, and then cross-identify single-epoch sources in the transient catalog with those precomputed positions. In this way, we recover 98% of the spatial locations of the NEAs after excluding flagged data. The coordinate difference between the MPC and our astrometric solution is less than ±1–2 arcsec in both the RA and Dec directions. Figure 12 shows the projected path of each orbit (solid lines) over the observing span, as well as the single-epoch sources that are detected in this region of the sky. To highlight the possibility of searching for untargeted moving objects with speeds similar to those of the targeted ones, we show only the known moving objects within a given FoV (gray linked streaks or points) identified by the Virtual Observatory SkyBoT tool (Berthier et al. 2006).
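
The cross-identification step can be illustrated with the sketch below, which pairs each precomputed ephemeris position with the nearest single-epoch source observed at the same epoch. The tuple layout, the 2 arcsec matching radius, and the epoch tolerance are assumptions chosen for the example; they are not taken from the pipeline itself.

```python
import math


def angular_separation_arcsec(ra1, dec1, ra2, dec2):
    """Great-circle separation (haversine formula); inputs in degrees, output in arcsec."""
    r1, d1, r2, d2 = (math.radians(v) for v in (ra1, dec1, ra2, dec2))
    h = math.sin((d2 - d1) / 2) ** 2 + math.cos(d1) * math.cos(d2) * math.sin((r2 - r1) / 2) ** 2
    return math.degrees(2 * math.asin(min(1.0, math.sqrt(h)))) * 3600.0


def match_ephemeris(sources, ephemeris, radius_arcsec=2.0, max_mjd_diff=1e-4):
    """Match (mjd, ra, dec) ephemeris points to (mjd, ra, dec) single-epoch sources.

    Returns a list of (ephemeris_index, source_index, separation_arcsec).
    """
    matches = []
    for i, (e_mjd, e_ra, e_dec) in enumerate(ephemeris):
        best = None
        for j, (s_mjd, s_ra, s_dec) in enumerate(sources):
            if abs(s_mjd - e_mjd) > max_mjd_diff:
                continue  # source was not observed at this epoch
            sep = angular_separation_arcsec(e_ra, e_dec, s_ra, s_dec)
            if sep <= radius_arcsec and (best is None or sep < best[1]):
                best = (j, sep)
        if best is not None:
            matches.append((i, best[0], best[1]))
    return matches


# Hypothetical positions: two sources near the first ephemeris point, none near the second.
sources = [(57000.0, 5.0, 4.0), (57000.0, 5.0 + 1.0 / 3600, 4.0), (57000.5, 5.1, 4.1)]
ephemeris = [(57000.0, 5.0 + 0.9 / 3600, 4.0), (57000.5, 5.2, 4.1)]
matches = match_ephemeris(sources, ephemeris)
```

The brute-force double loop is adequate for a handful of targets per frame; for a blind search over the full transient catalog, a spatial index would be the natural replacement.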


Figure 12. 
Projected orbital paths (left) and light curves (right) of 2006 DZ169 and 1996 SK. Left: The background gray dots are single-epoch sources in the transient catalog, showing only known moving objects listed in the VO SkyBoT database. Right: The arrows indicate the subsets of the light curves that are enlarged in the inset plots, which show rapid (less than a few hours) brightness variations due to rotation.

Lastly, we present light curves of the two targeted NEAs, which are representative in quality of other asteroids in the transient catalog because our experimental data were obtained under conditions optimal for measuring their periods (see Figure 12). These light curves are a superposition of two components with different timescales: a long-term change in brightness with increasing or decreasing solar phase angle, and a rapid periodic modulation in brightness due to the rotation of the asteroid (see the zoomed-in views of both light curves). After removing the long-term light variation and outliers, we measure rotation periods and full amplitudes for the NEAs using the periodogram tools. Their rotation periods are found to be similar to those in the NEAs database. The rotation period of 1996 SK is about 4.644 hours (P_new), a highly reliable result with full light curve coverage (cf. P_known = 4.645 hours), while the known period of 2006 DZ169, P_known = 4.682 hours, is not clearly seen in the data. 1996 SK has a large amplitude of ~0.5 mag in R-band, while 2006 DZ169 exhibits a relatively low-amplitude variation (~0.12 mag).


5. SUMMARY

We have described in detail the main algorithms of our new reduction pipeline to explore temporal and spatial variability with DEEP-South observations. Our multi-aperture photometry technique produces a homogeneous set of photometric measurements for all point sources in non-crowded fields observed by a distributed network of three different telescopes. This is important because it allows us to study the variability properties of targeted or untargeted objects in a large database of photometric time series without extra computational effort. Taking into account the spatial dependence of PSF variations, zeropoint variations, and systematic effects, we find that the RMS scatter of the light curves for point sources reaches down to the 0.3% level at the bright end (R ≤ 16) and the ~10% level at the faint end (R ~ 22) in the longest observing records of the DEEP-South year-one data. We also emphasize the use of indexed database tools, such as FastBit, both to minimize the required storage space and to achieve the fast query performance needed to produce large sets of light curves. In the future, we will apply this updated pipeline to the entire imaging data set taken by the DEEP-South survey to further explore stellar variability.

As applications of the photometric database, we first presented light curves of known variable stars to illustrate the overall quality of the photometric calibration. We find 21 new periodic variable stars with periods between 0.1 and 31 days, including four EBs and one WDMD candidate, whose variability classes are inferred from (i) the shape of the phase-folded light curve and/or (ii) their colors. Since we limit the sample to sources with SDSS colors, we could miss non-periodic variables as well as variable candidates without SDSS colors. To fully investigate the properties of all remaining sources, we will attempt to use an infinite Gaussian mixture model to detect variable objects and suppress false positives efficiently (e.g., Shin et al. 2009, 2012).

Additionally, we show the potential of database applications to retrieve the projected orbital paths and light curves of two targeted NEAs (2006 DZ169 and 1996 SK) in the experimental data. We expect to find more known asteroids moving at speeds similar to those of the targeted ones, and we will present the results of exploring this possibility in the second paper of the series. All light curves of variable objects in this experiment can be accessed through the web site: http://stardb.yonsei.ac.kr/.



Acknowledgments

We thank two anonymous referees for their constructive comments that improved this paper. This research was supported by the Korea Astronomy and Space Science Institute (KASI) under the R&D program (Project No.2015-1-320-18) supervised by the Ministry of Science, ICT and Future Planning. S.-W. C. acknowledges the support from KASI – Yonsei research collaboration program for the frontiers of astronomy and space science (2016-1-843-00). Parts of this research also were conducted by the Australian Research Council Centre of Excellence for All-sky Astrophysics (CAASTRO), through project number CE110001020. This research has made use of the KMTNet system operated by the KASI and the data were obtained at three host sites of CTIO in Chile, SAAO in South Africa, and SSO in Australia.


References
1. Albareti, F. D., Allende Prieto, C., Almeida, A., et al, (2017), The 13th Data Release of the Sloan Digital Sky Survey: First Spectroscopic Data from the SDSS-IV Survey Mapping Nearby Galaxies at Apache Point Observatory, ApJS, 233, p25.
2. Bertin, E., (2006), Automatic Astrometric and Photometric Calibration with SCAMP, ASPC, 351, p112.
3. Bertin, E., & Arnouts, S., (1996), SExtractor: Software for Source Extraction, A&AS, 117, p393.
4. Berthier, J., Vachier, F., Thuillot, W., et al, (2006), SkyBoT, A New VO Service to Identify Solar System Objects, ASPC, 351, p367.
5. Bowell, E., Koehn, B. W., Howell, S. B., et al, (1995), The Lowell Observatory Near-Earth-Object Search: A Progress Report, DPS, 27, p01-10.
6. Chang, S.-W., Byun, Y.-I., & Hartman, J. D., (2015), A New Method for Robust High-Precision Time-Series Photometry from Well-Sampled Images: Application to Archival MMT/Megacam Observations of the Open Cluster M37, AJ, 149, p135.
7. Chou, J., Howison, M., Austin, B., et al, (2011), Parallel Index and Query for Large Scale Data Analysis, In Proceedings of 2011 International Conference for High Performance Computing, Networking, Storage and Analysis (SC11), ACM, New York, USA, 30, p11.
8. Covey, K. R., Ivezić, Ž., Schlegel, D., et al, (2007), Stellar SEDs from 0.3 to 2.5 μm: Tracing the Stellar Locus and Searching for Color Outliers in the SDSS and 2MASS, AJ, 134, p2398.
9. Drake, A. J., Djorgovski, S. G., Mahabal, A., et al, (2009), First Results from the Catalina Real-Time Transient Survey, ApJ, 696, p870.
10. Drake, A. J., Graham, M. J., Djorgovski, S. G., et al, (2014), The Catalina Surveys Periodic Variable Star Catalog, ApJS, 213, p9.
11. Flewelling, H. A., Magnier, E. A., Chambers, K. C., et al, (2016), The Pan-STARRS1 Database and Data Products, arXiv:1612.05243.
12. Ivezić, Ž., Smith, J. A., Miknaitis, G., et al, (2007), Sloan Digital Sky Survey Standard Star Catalog for Stripe 82: The Dawn of Industrial 1% Optical Photometry, AJ, 134, p973.
13. Hartman, J. D., & Bakos, G. A., (2016), VARTOOLS: A Program for Analyzing Astronomical Time-Series Data, A&C, 17, p1.
14. Heinze, A. N., Tonry, J. L., Denneau, L., et al, (2018), A First Catalog of Variable Stars Measured by the Asteroid Terrestrial-Impact Last Alert System (ATLAS), arXiv:1804.02132.
15. Kaiser, N., Aussel, H., Burke, B. E., et al, (2002), PanSTARRS: A Large Synoptic Survey Telescope Array, SPIE, 4836, p154.
16. Kim, D.-W., Protopapas, P., Alcock, C., et al, (2009), Detrending Time Series for Astronomical Variability Surveys, MNRAS, 397, p558.
17. Kim, S.-L., Lee, C.-U., Park, B.-G., et al, (2016a), KMTNet: A Network of 1.6 m Wide-Field Optical Telescopes Installed at Three Southern Observatories, JKAS, 49, p37.
18. Kim, S.-L., Cha, S.-M., Lee, C.-U., et al, (2016b), Crosstalk Correction of the KMTNet Mosaic CCD Image, PKAS, 31, p35.
19. Lagerkvist, C.-I., Harris, A. W., & Zappala, V., (1989), Asteroid Lightcurve Parameters, Asteroids II, p1162.
20. Larson, S., Beshore, E., Hill, R., et al, (2003), The CSS and SSS NEO Surveys, DPS, 35, p3604.
21. Lin, C.-H., Ip, W.-H., Lin, Z.-Y., et al, (2014), Detection of Large Color Variation in the Potentially Hazardous Asteroid (297274) 1996 SK, RAA, 14, p311.
22. Liu, Y.-B., Wang, F., Ji, K.-F., et al, (2014), NVST Data Archiving System Based on FastBit NoSQL Database, JKAS, 47, p115.
23. Miceli, A., Rest, A., Stubbs, C. W., et al, (2008), Evidence for Distinct Components of the Galactic Stellar Halo from 838 RR Lyrae Stars Discovered in the LONEOS-I Survey, ApJ, 678, p865.
24. Moon, H.-K., Kim, M.-J., Yim, H.-S., et al, (2016), DEEP-South: Network Construction, Test Runs and Early Results, Asteroids: New Observations, New Models, 318, p306.
25. Mueller, M., Delbo’, M., Hora, J. L., et al, (2011), ExploreNEOs. III. Physical Characterization of 65 Potential Spacecraft Target Asteroids, AJ, 141, p109.
26. Palaversa, L., Ivezić, Ž., Eyer, L., et al, (2013), Exploring the Variable Sky with LINEAR. III. Classification of Periodic Light Curves, AJ, 146, p101.
27. Peacock, J. A., (1983), Two-Dimensional Goodness-of-Fit Testing in Astronomy, MNRAS, 202, p615.
28. Press, W. H., Teukolsky, S. A., Vetterling, W. T., & Flannery, B. P., (1992), Numerical Recipes, Cambridge, Cambridge Univ. Press.
29. Rebassa-Mansergas, A., Agurto-Gangas, C., Schreiber, M. R., et al, (2013), White Dwarf Main-Sequence Binaries from SDSS DR 8: Unveiling the Cool White Dwarf Population, MNRAS, 433, p3398.
30. Ruan, J. J., Anderson, S. F., MacLeod, C. L., et al, (2012), Characterizing the Optical Variability of Bright Blazars: Variability-Based Selection of Fermi Active Galactic Nuclei, ApJ, 760, p51.
31. Sesar, B., Stuart, J. S., Ivezić, Ž., et al, (2011), Exploring the Variable Sky with LINEAR. I. Photometric Recalibration with the Sloan Digital Sky Survey, AJ, 142, p190.
32. Sesar, B., Ivezić, Ž., Stuart, J. S., et al, (2013), Exploring the Variable Sky with LINEAR. II. Halo Structure and Substructure Traced by RR Lyrae Stars to 30 kpc, AJ, 146, p21.
33. Shin, M.-S., & Byun, Y.-I., (2004), Efficient Period Search for Time Series Photometry, JKAS, 37, p79.
34. Shin, M.-S., Sekora, M., & Byun, Y.-I., (2009), Detecting Variability in Massive Astronomical Time Series Data I. Application of an Infinite Gaussian Mixture Model, MNRAS, 400, p1897.
35. Shin, M.-S., Yi, H., Kim, D.-W., et al, (2012), Detecting Variability in Massive Astronomical Time-Series Data. II. Variable Candidates in the Northern Sky Variability Survey, AJ, 143, p65.
36. Skrutskie, M. F., Cutri, R. M., Stiening, R., et al, (2006), The Two Micron All Sky Survey (2MASS), AJ, 131, p1163.
37. Stokes, G. H., Evans, J. B., Viggh, H. E. M., et al, (2000), Lincoln Near-Earth Asteroid Program (LINEAR), Icar, 148, p21.
38. Torrealba, G., Catelan, M., Drake, A. J., et al, (2015), Discovery of ∼9000 new RR Lyrae in the Southern Catalina Surveys, MNRAS, 446, p2251.
39. van Dokkum, P. G., (2001), Cosmic-Ray Rejection by Laplacian Edge Detection, PASP, 113, p1420.
40. Watson, C., Henden, A. A., & Price, A., (2017), VizieR Online Data Catalog: AAVSO International Variable Star Index VSX, 1.
41. Wolf, C., Onken, C. A., Luvaul, L. C., et al, (2018), SkyMapper Southern Survey: First Data Release (DR1), PASA, 35, p10.
42. Wu, K., Ahern, S., Bethel, E. W., et al, (2009), FastBit: Interactively Searching Massive Data, JPhCS, 180, p012053.
43. Yim, H.-S., Kim, M.-J., Bae, Y.-H., et al, (2016), DEEP-South: Automated Observation Scheduling, Data Reduction and Analysis Software Subsystem, Asteroids: New Observations, New Models, 318, p311.

APPENDIX A. DATABASE SYSTEM

The key technology of FastBit is a bitmap-based indexing scheme that stores the list of row identifiers for each value of an attribute column as a sequence of bits (i.e., 0 or 1). It also reduces the size of the index file significantly. Given the query-intensive nature of our experiments, this kind of indexing helps improve query performance. Liu et al. (2014) reported good performance of this algorithm in index creation, query speed and storage space in comparison with a typical relational database (e.g., MySQL or PostgreSQL). The reader is referred to Liu et al. (2014) for details of the quantitative comparison between the two database systems using high-resolution solar observations.
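
The idea of an equality-encoded bitmap index can be shown with a toy example. The sketch below is not FastBit code: it uses Python integers as uncompressed bit vectors, whereas FastBit stores compressed bitmaps, and the column contents are made up for illustration (only the column names FLAGS and AMPS are borrowed from our catalog).

```python
def build_bitmap_index(column):
    """One bitmap per distinct value; bit i is set when row i holds that value."""
    index = {}
    for row, value in enumerate(column):
        index[value] = index.get(value, 0) | (1 << row)
    return index


def rows_of(bitmap):
    """Decode a bitmap back into a sorted list of row identifiers."""
    rows = []
    row = 0
    while bitmap:
        if bitmap & 1:
            rows.append(row)
        bitmap >>= 1
        row += 1
    return rows


# Toy catalog columns (values are illustrative only).
flags = [0, 4, 0, 2, 0, 4]
amps = [1, 1, 2, 2, 1, 2]

flags_index = build_bitmap_index(flags)
amps_index = build_bitmap_index(amps)

# "FLAGS == 0 AND AMPS == 1" becomes a single bitwise AND of two bitmaps.
hits = rows_of(flags_index[0] & amps_index[1])
```

Multi-condition queries reduce to bitwise operations on precomputed bitmaps, which is why this kind of index suits read-mostly, query-intensive workloads.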

We check the query response time for two cases: (i) a single-thread application with different query options and (ii) a multi-thread application for multiple query processing. In the former case, we compare the performance of counting the number of hits returned without any options to that with only a SELECT clause and to that with an output option. For sizes up to 100 million rows, typical queries take less than one second on an Intel Xeon processor with a 2.66 GHz clock speed (see the top panel of Figure 13). FastBit database operation is limited by I/O performance, which is dominated by disk accesses rather than by computations performed by the CPUs. In the multi-thread case, we prepared two files containing 1000 and 10000 queries, respectively, in which each SELECT statement is randomly defined. Speed improves when we create multiple query processes and execute these queries in parallel, as shown in the bottom panel of Figure 13. Here we present only a brief example application in an astronomical context; we refer the reader to Chou et al. (2011) for more details about FastBit-based parallel query processing.


Figure 13. 
Assessment of query execution performance using the FastBit database system for both single-thread (top) and multi-thread configurations (bottom). The dashed lines indicate the slope of perfect scaling with respect to single-node performance for reference.

We use the command-line tools provided by FastBit to convert data formats and build indexes. We hope that the examples below make it easier for readers to apply the same tools to their own data sets. The following example command converts frame catalogs to metadata tables in a raw binary form:

ardea -d DeepSouth-DB (directory-to-write-data)
-m "LOCAL_ID:uint, XWIN_IMAGE:double,
YWIN_IMAGE:double, XWIN_RA:double,
YWIN_DEC:double, MJD:double,
MAG_MAP:double, MAGERR_MAP:double,
FWHM_IMAGE:float, ELONGATION:float,
CLASS_STAR:float, FLAGS:short,
CALDEX:int, AMPS:int"
-t frame-catalog (text-file-to-read)
-v 5 (verbose_level)
-b ‘’(break/delimiters-in-text-data).

The following example command builds new indexes with basic bitmap options for the binning, encoding, and compression processes:

ibis -d DeepSouth-DB (target database)
-b "<binning nbins=B/><encoding equality/>"
-z (append-existing-indexes)
-v 5 (verbose_level).

We choose the binning option to reduce the number of bitmaps for attributes with (very) high cardinalities. The strategy of binning was discussed in detail in Section 2.4 of Wu et al. (2009), where it is shown that binning can improve the query response time based on order-preserving bin-based clustering. The maximum size of the index is primarily determined by three parameters: the number of rows N, the number of bins B, and the bitmap encoding. Under the equality encoding condition, our test database contains 15,629,598 rows each with 17 attribute columns, and its indexed size is about 2.4 gigabytes for small B (< 100).
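
To make the role of binning concrete, the sketch below builds one bitmap per bin instead of one per distinct value and answers a range query, checking the raw values only for bins that straddle the query boundaries. It is a simplified illustration rather than FastBit's implementation: the equal-width bins, the function names, and the sample magnitudes are assumptions, and the size estimate is the uncompressed upper limit of N × B bits per column, not the on-disk size after compression.

```python
def build_binned_index(values, lo, hi, nbins):
    """Equal-width binning over [lo, hi): one bitmap (Python int) per bin."""
    width = (hi - lo) / nbins
    bitmaps = [0] * nbins
    for row, v in enumerate(values):
        if not lo <= v < hi:
            raise ValueError("value outside the binned range")
        b = min(nbins - 1, int((v - lo) / width))
        bitmaps[b] |= 1 << row
    return bitmaps


def range_query(values, bitmaps, lo, hi, q_lo, q_hi):
    """Rows with q_lo <= value < q_hi, using raw values only for edge bins."""
    width = (hi - lo) / len(bitmaps)
    hits = 0
    candidates = 0
    for b, bitmap in enumerate(bitmaps):
        b_lo = lo + b * width
        b_hi = b_lo + width
        if q_lo <= b_lo and b_hi <= q_hi:
            hits |= bitmap  # bin lies fully inside the query range
        elif b_lo < q_hi and q_lo < b_hi:
            candidates |= bitmap  # bin overlaps a boundary: needs a candidate check
    rows = []
    for row, v in enumerate(values):
        if (hits >> row) & 1:
            rows.append(row)
        elif (candidates >> row) & 1 and q_lo <= v < q_hi:
            rows.append(row)
    return rows


def uncompressed_index_bytes(nrows, nbins, ncols):
    """Upper limit for equality-encoded bitmaps before compression: N * B bits per column."""
    return nrows * nbins * ncols // 8


# Illustrative magnitudes binned into 8 one-magnitude bins between 14 and 22.
mags = [15.2, 16.7, 16.1, 21.5, 18.3, 16.9]
index = build_binned_index(mags, 14.0, 22.0, 8)
aligned = range_query(mags, index, 14.0, 22.0, 16.0, 17.0)
straddling = range_query(mags, index, 14.0, 22.0, 16.5, 18.5)
size_bound = uncompressed_index_bytes(15_629_598, 100, 17)
```

Queries whose limits coincide with bin edges are answered from the bitmaps alone, while other queries pay the extra cost of reading raw values for the edge bins; this trade-off is what the choice of B controls.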

Unfortunately, the FastBit scheme has intrinsic limitations because it is just a stand-alone data processing tool. It is not a database management system, so most SQL commands are not supported. It also imposes a limit on the number of rows that can be stored in indexed tables (no more than 2 billion rows). Because it is also not well optimized for parallel processing in a cloud-like environment, we are now testing open source database systems, such as Redis and GeoMesa, to overcome the current difficulties.