
Journal information

  • Journal title: ACM Transactions on Database Systems
  • Chinese title: 数据库系统上的ACM事务
  • Publication frequency: 1.245
  • ISSN: 0362-5915
  • Publisher: -
  • Description:
530 results
  • Scalable Analytics on Fast Data
    Abstract: Today's streaming applications demand increasingly high event throughput rates and are often subject to strict latency constraints. To allow for more complex workloads, such as window-based aggregations, streaming systems need to support stateful event processing. This introduces new challenges for streaming engines as the state needs to be maintained in a consistent and durable manner and simultaneously accessed by complex queries for real-time analytics. Modern streaming systems, such as Apache Flink, do not allow for efficiently exposing the state to analytical queries. Thus, data engineers are forced to keep the state in external data stores, which significantly increases the latencies until events become visible to analytical queries. Proprietary solutions have been created to meet data freshness constraints. These solutions are expensive, error-prone, and difficult to maintain. Main-memory database systems, such as HyPer, achieve extremely low query response times while maintaining high update rates, which makes them well-suited for analytical streaming workloads. In this article, we explore extensions to database systems to match the performance and usability of streaming systems.
  • Discovering Historic Moments in Sequence Data
    Abstract: Many emerging applications are based on finding interesting subsequences from sequence data. Finding "prominent streaks," a set of the longest contiguous subsequences with values all above (or below) a certain threshold, from sequence data is one such problem that has received much attention. Motivated by real applications, we observe that prominent streaks alone are not insightful enough but require the discovery of something we coined as "historic moments" as companions. In this article, we present an algorithm to efficiently compute historic moments from sequence data. The algorithm is incremental and space optimal, meaning that when facing new data arrival, it is able to efficiently refresh the results by keeping minimal information. Case studies show that historic moments can significantly improve the insights offered by prominent streaks alone. Furthermore, experiments show that our algorithm can outperform the baseline in both time and space.
  • Representations and Optimizations for Embedded Parallel Dataflow Languages
    Abstract: Parallel dataflow engines such as Apache Hadoop, Apache Spark, and Apache Flink are an established alternative to relational databases for modern data analysis applications. A characteristic of these systems is a scalable programming model based on distributed collections and parallel transformations expressed by means of second-order functions such as map and reduce. Notable examples are Flink's DataSet and Spark's RDD programming abstractions. These programming models are realized as EDSLs, domain-specific languages embedded in a general-purpose host language such as Java, Scala, or Python. This approach has several advantages over traditional external DSLs such as SQL or XQuery. First, syntactic constructs from the host language (e.g., anonymous functions syntax, value definitions, and fluent syntax via method chaining) can be reused in the EDSL. This eases the learning curve for developers already familiar with the host language. Second, it allows for seamless integration of library methods written in the host language via the function parameters passed to the parallel dataflow operators. This reduces the effort for developing analytics dataflows that go beyond pure SQL and require domain-specific logic. At the same time, however, state-of-the-art parallel dataflow EDSLs exhibit a number of shortcomings. First, one of the main advantages of an external DSL such as SQL, the high-level, declarative Select-From-Where syntax, is either lost completely or mimicked in a non-standard way. Second, execution aspects such as caching, join order, and partial aggregation have to be decided by the programmer. Optimizing them automatically is very difficult due to the limited program context available in the intermediate representation of the DSL. In this article, we argue that the limitations listed above are a side effect of the adopted type-based embedding approach. As a solution, we propose an alternative EDSL design based on quotations. We present a DSL embedded in Scala and discuss its compiler pipeline, intermediate representation, and some of the enabled optimizations. We promote the algebraic type of bags in union representation as a model for distributed collections and its associated structural recursion scheme and monad as a model for parallel collection processing. At the source code level, Scala's comprehension syntax over a bag monad can be used to encode Select-From-Where expressions in a standard way. At the intermediate representation level, maintaining comprehensions as a first-class citizen can be used to simplify the design and implementation of holistic dataflow optimizations that accommodate nesting and control flow. The proposed DSL design therefore reconciles the benefits of embedded parallel dataflow DSLs with the declarativity and optimization potential of external DSLs like SQL.
  • Wander Join and XDB: Online Aggregation via Random Walks
    Abstract: Joins are expensive, and online aggregation over joins was proposed to mitigate the cost, which offers users a nice and flexible tradeoff between query efficiency and accuracy in a continuous, online fashion. However, the state-of-the-art approach, in both internal and external memory, is based on ripple join, which is still very expensive and even needs unrealistic assumptions (e.g., tuples in a table are stored in random order). This article proposes a new approach, the wander join algorithm, to the online aggregation problem by performing random walks over the underlying join graph. We also design an optimizer that chooses the optimal plan for conducting the random walks without having to collect any statistics a priori. Compared with ripple join, wander join is particularly efficient for equality joins involving multiple tables, but also supports theta-joins. Selection predicates and group-by clauses can be handled as well. To demonstrate the usefulness of wander join, we have designed and implemented XDB (approXimate DB) by integrating wander join into various systems including PostgreSQL, Spark, and a stand-alone plug-in version using PL/SQL. The design and implementation of XDB have demonstrated wander join's practicality in a full-fledged database system. Extensive experiments using the TPC-H benchmark have demonstrated the superior performance of wander join over ripple join.
  • Dependencies for Graphs
    Abstract: This article proposes a class of dependencies for graphs, referred to as graph entity dependencies (GEDs). A GED is defined as a combination of a graph pattern and an attribute dependency. In a uniform format, GEDs can express graph functional dependencies with constant literals to catch inconsistencies, and keys carrying id literals to identify entities (vertices) in a graph. We revise the chase for GEDs and prove its Church-Rosser property. We characterize GED satisfiability and implication, and establish the complexity of these problems and the validation problem for GEDs, in the presence and absence of constant literals and id literals. We also develop a sound, complete and independent axiom system for finite implication of GEDs. In addition, we extend GEDs with built-in predicates or disjunctions, to strike a balance between the expressive power and complexity. We settle the complexity of the satisfiability, implication, and validation problems for these extensions.
  • Inferring Insertion Times and Optimizing Error Penalties in Time-Decaying Bloom Filters
    Abstract: Current Bloom Filters tend to ignore Bayesian priors as well as a great deal of useful information they hold, compromising the accuracy of their responses. Incorrect responses cause users to incur penalties that are both application- and item-specific, but current Bloom Filters are typically tuned only for static penalties. Such shortcomings are problematic for all Bloom Filter variants, but especially so for Time-decaying Bloom Filters, in which the memory of older items decays over time, causing both false positives and false negatives. We address these issues by introducing inferential filters, which integrate Bayesian priors and information latent in filters to make penalty-optimal, query-specific decisions. We also show how to properly infer insertion times in such filters. Our methods are general, but here we illustrate their application to inferential time-decaying filters to support novel query types and sliding window queries with dynamic error penalties. We present inferential versions of the Timing Bloom Filter and Generalized Bloom Filter. Our experiments on real and synthetic datasets show that our methods reduce penalties for incorrect responses to sliding-window queries in these filters by up to 70% when penalties are dynamic.
  • Output-Optimal Massively Parallel Algorithms for Similarity Joins
    Abstract: Parallel join algorithms have received much attention in recent years due to the rapid development of massively parallel systems such as MapReduce and Spark. In the database theory community, most efforts have been focused on studying worst-case optimal algorithms. However, the worst-case optimality of these join algorithms relies on the hard instances having very large output sizes. In the case of a two-relation join, the hard instance is just a Cartesian product, with an output size that is quadratic in the input size. In practice, however, the output size is usually much smaller. One recent parallel join algorithm by Beame et al. has achieved output-optimality (i.e., its cost is optimal in terms of both the input size and the output size), but their algorithm only works for a 2-relation equi-join and has some imperfections. In this article, we first improve their algorithm to true optimality. Then we design output-optimal algorithms for a large class of similarity joins. Finally, we present a lower bound, which essentially eliminates the possibility of having output-optimal algorithms for any join on more than two relations.
  • A Survey of Spatial Crowdsourcing
    Abstract: Widespread use of advanced mobile devices has led to the emergence of a new class of crowdsourcing called spatial crowdsourcing. Spatial crowdsourcing advances the potential of a crowd to perform tasks related to real-world scenarios involving physical locations, which were not feasible with conventional crowdsourcing methods. The main feature of spatial crowdsourcing is the presence of spatial tasks that require workers to be physically present at a particular location for task fulfillment. Research related to this new paradigm has gained momentum in recent years, necessitating a comprehensive survey to offer a bird's-eye view of the current state of spatial crowdsourcing literature. In this article, we discuss the spatial crowdsourcing infrastructure and identify the fundamental differences between spatial and conventional crowdsourcing. Furthermore, we provide a comprehensive view of the existing literature by introducing a taxonomy, elucidate the issues/challenges faced by different components of spatial crowdsourcing, and suggest potential research directions for the future.
  • From a Comprehensive Experimental Survey to a Cost-Based Selection Strategy for Lightweight Integer Compression Algorithms
    Abstract: Lightweight integer compression algorithms are frequently applied in in-memory database systems to tackle the growing gap between processor speed and main memory bandwidth. In recent years, the vectorization of basic techniques such as delta coding and null suppression has considerably enlarged the corpus of available algorithms. As a result, today there is a large number of algorithms to choose from, while different algorithms are tailored to different data characteristics. However, a comparative evaluation of these algorithms with different data and hardware characteristics has never been sufficiently conducted in the literature. To close this gap, we conducted an exhaustive experimental survey by evaluating several state-of-the-art lightweight integer compression algorithms as well as cascades of basic techniques. We systematically investigated the influence of data as well as hardware properties on the performance and the compression rates. The evaluated algorithms are based on publicly available implementations as well as our own vectorized reimplementations. We summarize our experimental findings leading to several new insights and to the conclusion that there is no single-best algorithm. Moreover, in this article, we also introduce and evaluate a novel cost model for the selection of a suitable lightweight integer compression algorithm for a given dataset.
  • A Unified Framework for Frequent Sequence Mining with Subsequence Constraints
    Abstract: Frequent sequence mining methods often make use of constraints to control which subsequences should be mined. A variety of such subsequence constraints has been studied in the literature, including length, gap, span, regular-expression, and hierarchy constraints. In this article, we show that many subsequence constraints, including and beyond those considered in the literature, can be unified in a single framework. A unified treatment allows researchers to study jointly many types of subsequence constraints (instead of each one individually) and helps to improve usability of pattern mining systems for practitioners. In more detail, we propose a set of simple and intuitive "pattern expressions" to describe subsequence constraints and explore algorithms for efficiently mining frequent subsequences under such general constraints. Our algorithms translate pattern expressions to succinct finite-state transducers, which we use as computational model, and simulate these transducers in a way suitable for frequent sequence mining. Our experimental study on real-world datasets indicates that our algorithms, although more general, are efficient and, when used for sequence mining with prior constraints studied in literature, competitive with (and in some cases superior to) state-of-the-art specialized methods.
  • Interactive Mapping Specification with Exemplar Tuples
    Abstract: While schema mapping specification is a cumbersome task for data curation specialists, it becomes infeasible for non-expert users, who are unacquainted with the semantics and languages of the involved transformations. In this article, we present an interactive framework for schema mapping specification suited for non-expert users. The underlying key intuition is to leverage a few exemplar tuples to infer the underlying mappings and iterate the inference process via simple user interactions in the form of Boolean queries on the validity of the initial exemplar tuples. The approaches available so far mainly assume pairs of complete universal data examples, which can be solely provided by data curation experts, or are limited to poorly expressive mappings. We present a quasi-lattice-based exploration of the space of all possible mappings that satisfy arbitrary user exemplar tuples. Along the exploration, we challenge the user to retain the mappings that best fit the user's requirements and to dynamically prune the exploration space, thus reducing the number of user interactions. We prove that after the refinement process, the obtained mappings are correct and complete. We present an extensive experimental analysis devoted to measuring the feasibility of our interactive mapping strategies and the inherent quality of the obtained mappings.
  • Verification of Hierarchical Artifact Systems
    Abstract: Data-driven workflows, of which IBM's Business Artifacts are a prime exponent, have been successfully deployed in practice, adopted in industrial standards, and have spawned a rich body of research in academia, focused primarily on static analysis. The present work represents a significant advance on the problem of artifact verification by considering a much richer and more realistic model than in previous work, incorporating core elements of IBM's successful Guard-Stage-Milestone model. In particular, the model features task hierarchy, concurrency, and richer artifact data. It also allows database key and foreign key dependencies, as well as arithmetic constraints. The results show decidability of verification and establish its complexity, making use of novel techniques including a hierarchy of Vector Addition Systems and a variant of quantifier elimination tailored to our context.
  • Editorial: Updates to the Editorial Board
    Abstract:
  • Practical Private Search in Depth
    Abstract: We consider a data owner that outsources its dataset to an untrusted server. The owner wishes to enable the server to answer range queries on a single attribute, without compromising the privacy of the data and the queries. There are several schemes on "practical" private range search (mainly in database venues) that attempt to strike a trade-off between efficiency and security. Nevertheless, these methods either lack provable security guarantees or permit unacceptable privacy leakages. In this article, we take an interdisciplinary approach, which combines the rigor of security formulations and proofs with efficient data management techniques. We construct a wide set of novel schemes with realistic security/performance trade-offs, adopting the notion of Searchable Symmetric Encryption (SSE), primarily proposed for keyword search. We reduce range search to multi-keyword search using range-covering techniques with tree-like indexes, and formalize the problem as Range Searchable Symmetric Encryption (RSSE). We demonstrate that, given any secure SSE scheme, the challenge boils down to (i) formulating leakages that arise from the index structure and (ii) minimizing false positives incurred by some schemes under heavy data skew. We also explain an important concept in the recent SSE bibliography, namely locality, and design generic and specialized ways to attribute locality to our RSSE schemes. Moreover, we are the first to devise secure schemes for answering range aggregate queries, such as range sums and range min/max. We analytically detail the superiority of our proposals over prior work and experimentally confirm their practicality.
  • Estimating the Impact of Unknown Unknowns on Aggregate Query Results
    Abstract: It is common practice for data scientists to acquire and integrate disparate data sources to achieve higher quality results. But even with a perfectly cleaned and merged data set, two fundamental questions remain: (1) Is the integrated data set complete? and (2) What is the impact of any unknown (i.e., unobserved) data on query results?
  • Building Efficient Query Engines in a High-Level Language
    Abstract: Abstraction without regret refers to the vision of using high-level programming languages for systems development without experiencing a negative impact on performance. A database system designed according to this vision offers both increased productivity and high performance instead of sacrificing the former for the latter as is the case with existing, monolithic implementations that are hard to maintain and extend.
  • Rewriting Queries Using Views
    Abstract: A query Q in a language L has a bounded rewriting using a set of L-definable views if there exists a query Q' in L such that given any dataset D, Q(D) can be computed by Q' that accesses only cached views and a small fraction D_Q of D. We consider datasets D that satisfy a set of access constraints, which are a combination of simple cardinality constraints and associated indices, such that the size |D_Q| of D_Q and the time to identify D_Q are independent of |D|, no matter how big D is.
  • TriAL: A Navigational Algebra for RDF Triplestores
    Abstract: Navigational queries over RDF data are viewed as one of the main applications of graph query languages, and yet the standard model of graph databases (essentially labeled graphs) is different from the triples-based model of RDF. While encodings of RDF databases into graph data exist, we show that even the most natural ones are bound to lose some functionality when used in conjunction with graph query languages. The solution is to work directly with triples, but then many properties taken for granted in the graph database context (e.g., reachability) lose their natural meaning.
  • Lightweight Monitoring of Distributed Streams
    Abstract: As data becomes dynamic, large, and distributed, there is increasing demand for what have become known as distributed stream algorithms. Since continuously collecting the data to a central server and processing it there is infeasible, a common approach is to define local conditions at the distributed nodes, such that, as long as they are maintained, some desirable global condition holds.
  • K-Regret Queries Using Multiplicative Utility Functions
    Abstract: The k-regret query aims to return a size-k subset S of a database D such that, for any query user who selects a data object from this size-k subset S rather than from database D, her regret ratio is minimized. The regret ratio here is modeled by the relative difference in optimality between the locally optimal object in S and the globally optimal object in D. The optimality of a data object in turn is modeled by a utility function of the query user. Unlike traditional top-k queries, the k-regret query does not minimize the regret ratio for a specific utility function. Instead, it considers an infinite family of utility functions F, and aims to find a size-k subset that minimizes the maximum regret ratio of any utility function in F.
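
The regret ratio described in the last abstract above can be made concrete with a small sketch. This is an illustration only, not the article's algorithm: it merely evaluates the maximum regret ratio of a given subset over a finite sample of utility functions, whereas the abstract considers an infinite family F. The names `regret_ratio`, `max_regret_ratio`, and `linear` are hypothetical, and the linear utilities in the example are an assumption chosen for simplicity (the article's title refers to multiplicative utility functions).

```python
from typing import Callable, Iterable, Sequence

Point = Sequence[float]
Utility = Callable[[Point], float]


def regret_ratio(database: Sequence[Point], subset: Sequence[Point],
                 utility: Utility) -> float:
    """Relative loss of the best object in `subset` versus the best in `database`.

    Assumes the best utility over `database` is strictly positive.
    """
    best_global = max(utility(p) for p in database)
    best_local = max(utility(p) for p in subset)
    return (best_global - best_local) / best_global


def max_regret_ratio(database: Sequence[Point], subset: Sequence[Point],
                     utilities: Iterable[Utility]) -> float:
    """Worst regret ratio over a finite sample of utility functions."""
    return max(regret_ratio(database, subset, u) for u in utilities)


def linear(weights: Sequence[float]) -> Utility:
    """Hypothetical helper: weighted-sum utility, used only for this example."""
    return lambda p: sum(w * x for w, x in zip(weights, p))


database = [(1.0, 0.0), (0.0, 1.0), (0.6, 0.6)]
subset = [(0.6, 0.6)]  # a candidate size-1 answer
sample = [linear((1.0, 0.0)), linear((0.0, 1.0)), linear((0.5, 0.5))]
worst = max_regret_ratio(database, subset, sample)
print(worst)
```

With this toy data, a user who cares only about the first attribute loses 0.4 of her best achievable utility, a user who weights both attributes equally loses nothing, and so the maximum regret ratio over the sample is 0.4 (up to floating-point rounding). A k-regret query would search for the size-k subset that makes this worst-case value as small as possible.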