ORDER BY - IT文库 (IT Library) search results


5. ClickHouse at Ximalaya for Shanghai Meetup 2019 PDF

... groupArray(timestamp) as timestamps, arrayEnumerate(pages) as index FROM (SELECT * FROM client_log_all ORDER BY timestamp) GROUP BY user ... SELECT user, groupArray(page) as pages ... except Order] arrayFilter((i, p) -> (pages[i] = 'HomePage' AND pages[i+1] = 'Detail' AND pages[i+2] != 'Order'), index, pages) as level_2, // In pages array, find a subarray of [HomePage, Detail, Order] arrayFilter((i ... (pages[i] = 'HomePage' AND pages[i+1] = 'Detail' AND pages[i+2] = 'Order'), index, pages) as level_3 FROM (SELECT * FROM client_log_all ORDER BY timestamp) GROUP BY user ...

0 points | 28 pages | 6.87 MB | 1 year ago
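The funnel snippet above filters a per-user page array for the sequences [HomePage, Detail, not Order] and [HomePage, Detail, Order]. The same logic can be sketched in plain Python. This is an illustration only, not the deck's code: the function name `funnel_levels` is hypothetical, and the sketch assumes that an out-of-range array read yields an empty string, as it does for String arrays in ClickHouse.

```python
def funnel_levels(pages):
    """Return (level_2, level_3): 1-based start indexes of each funnel match."""

    def at(i):
        # 1-based access; an out-of-range read is assumed to yield ''.
        return pages[i - 1] if 1 <= i <= len(pages) else ""

    index = range(1, len(pages) + 1)  # plays the role of arrayEnumerate(pages)

    # HomePage -> Detail, but the next page is not Order.
    level_2 = [
        i for i in index
        if at(i) == "HomePage" and at(i + 1) == "Detail" and at(i + 2) != "Order"
    ]
    # HomePage -> Detail -> Order.
    level_3 = [
        i for i in index
        if at(i) == "HomePage" and at(i + 1) == "Detail" and at(i + 2) == "Order"
    ]
    return level_2, level_3


if __name__ == "__main__":
    visits = ["HomePage", "Detail", "Order", "HomePage", "Detail", "Cart"]
    print(funnel_levels(visits))
```

The input list must already be ordered by time, which is what the inner `ORDER BY timestamp` subquery in the snippet provides before `groupArray` collects the pages.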
ClickHouse in Production

... SumShows, countIf(CounterType='Click') as SumClicks, BannerID FROM EventLogHDFS GROUP BY BannerID ORDER BY SumClicks desc LIMIT 3; ... In ClickHouse: Most Clicked Banner: SELECT countIf(CounterType='Show') SumShows, countIf(CounterType='Click') as SumClicks, BannerID FROM EventLogHDFS GROUP BY BannerID ORDER BY SumClicks desc LIMIT 3; SumShows 6485 | SumClicks 1015 | BannerID 6251269090 ... In ClickHouse: Local Log Copy: CREATE TABLE EventLogLocal AS EventLogHDFS ENGINE = MergeTree() ORDER BY BannerID; Ok. INSERT INTO EventLogLocal SELECT * FROM EventLogHDFS; Ok. 0 rows in set. Elapsed: ...

0 points | 100 pages | 6.86 MB | 1 year ago
8. Continue to use ClickHouse as TSDB

... CREATE TABLE demonstration.insert_view ... `HeartRate` UInt8, `Humidity` Float32, ... ) ENGINE = MergeTree() PARTITION BY toYYYYMM(Time) ORDER BY (Name, Time, Age, ...); ► Column-Orient Model ... How we do ... CPU: Intel Skylake 8 core, Memory ... 'cpu-usage_user') AND ((created_at >= '2016-01-01 08:00:00') AND (created_at <= '2016-01-01 09:00:00')) ORDER BY toStartOfMinute(created_at) DESC LIMIT 5 ... value: 4, 4, 4, 4 ...

0 points | 42 pages | 911.10 KB | 1 year ago
1. Machine Learning with ClickHouse

... SAMPLE x OFFSET y ... CREATE TABLE trips_sample_time ( pickup_datetime DateTime ) ENGINE = MergeTree ORDER BY sipHash64(pickup_datetime) -- Primary Key SAMPLE BY sipHash64(pickup_datetime) -- expression for ... total_amount, trip_distance, (toYear(pickup_datetime) - 2009) * (trip_distance + 1)) FROM trips WHERE <...> ORDER BY sipHash64(trip_id) ASC ... [2.138706869701764,0.25152600248358253,4.5418692076782445] That's better ... as aggregate function state in a separate table. Example: CREATE TABLE models ENGINE = MergeTree ORDER BY tuple() AS SELECT stochasticLinearRegressionState(total_amount, trip_distance) AS model FROM ...

0 points | 64 pages | 1.38 MB | 1 year ago
0. Machine Learning with ClickHouse

... SAMPLE x OFFSET y ... CREATE TABLE trips_sample_time ( pickup_datetime DateTime ) ENGINE = MergeTree ORDER BY sipHash64(pickup_datetime) -- Primary Key SAMPLE BY sipHash64(pickup_datetime) -- expression for ... total_amount, trip_distance, (toYear(pickup_datetime) - 2009) * (trip_distance + 1)) FROM trips WHERE <...> ORDER BY sipHash64(trip_id) ASC ... [2.138706869701764,0.25152600248358253,4.5418692076782445] That's better ... as aggregate function state in a separate table. Example: CREATE TABLE models ENGINE = MergeTree ORDER BY tuple() AS SELECT stochasticLinearRegressionState(total_amount, trip_distance) AS model FROM ...

0 points | 64 pages | 1.38 MB | 1 year ago
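4. ClickHouse in Practice for User Profiling at Suning

... groupBitmapState Integer ... Aggregate functions: groupBitmapAnd, groupBitmapOr, groupBitmapXor ... Bitmap usage example: order_id, order_date, user_id, product_id | 1, 2019-10-01, 1, p1 | 2, 2019-10-01, 1, p2 | 3, 2019-10-01, 2, p1 ... 2019-10-02, 5, p1 | 8, 2019-10-02, 5, p2. Given a simple order detail table, detail_order, how do we compute daily user retention? ... Tag SQL: large-table joins and count distinct are both slow and prone to OOM! Bitmap usage example: order_date, uv_bitmap | 2019-10-01, {1,2,3} | 2019-10-02, {3 ... 5] ... New users: day2 ANDNOT day1 = [4,5]. Churned users: day1 ANDNOT day2 = [1,2] ... SQL for retained users after aggregating detail_order into a day-level table. With Bitmap functions, tens of millions of users, results in seconds! ... Contents: How Suning uses ClickHouse; Bitmap integration in ClickHouse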

0 points | 32 pages | 1.47 MB | 1 year ago
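The bitmap arithmetic in the Suning snippet (retained, new, and churned users from two daily user bitmaps) can be illustrated with Python sets standing in for bitmaps. This is a sketch under assumptions: the snippet truncates the 2019-10-02 bitmap, so `{3, 4, 5}` is assumed here because it is consistent with the new-user and churned-user results the snippet states; the function name `retention` is hypothetical.

```python
def retention(day1, day2):
    """Classify users from two daily user-id sets (stand-ins for bitmaps)."""
    return {
        "retained": day1 & day2,  # day1 AND day2
        "new": day2 - day1,       # day2 ANDNOT day1
        "churned": day1 - day2,   # day1 ANDNOT day2
    }


day1 = {1, 2, 3}  # uv_bitmap for 2019-10-01, as given in the snippet
day2 = {3, 4, 5}  # ASSUMED: the snippet truncates the 2019-10-02 bitmap

if __name__ == "__main__":
    for name, users in retention(day1, day2).items():
        print(name, sorted(users))
```

Each result is a single set operation on two precomputed per-day values, which is why the snippet contrasts this approach with large-table joins and count distinct.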
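2. Handling Hundreds of Billions of Rows per Day with ClickHouse - Qutoutiao

1: 128 GB+ of memory per machine is recommended. 2: Use symbolic links to spread different tables across different disks, so that one machine can mount more disks. The "hot/cold data separation" feature in the latest version: a workaround? ... Problems we ran into: order by (timestamp, eventType) or order by (eventType, timestamp)? Business scenario: 1: Data reported by Qutoutiao and Midu is distinguished by event type (eventType). 2: The metrics system has "time-sliced" and "cumulative" metrics ... table where dt='' and timestamp>='' and timestamp<='' and eventType='' ... We did not think deeply enough when creating the table. Because of how time-sliced metrics work, our table was indexed with order by (timestamp, eventType), which made cumulative metrics very slow to compute (60 billion+ rows). Analysis: for cumulative data the time index is essentially ineffective, because timestamp ... from table where column=value; select column1, column2 from table where column=value ... For any SQL involving group by, order by, distinct, or join, memory usage is no longer O(1). Solution: 1: max_bytes_before_external_group_by 2: max_bytes_before_external_sort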

0 points | 14 pages | 1.10 MB | 1 year ago
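The Qutoutiao snippet describes why the column order in the sorting key matters when a query filters on eventType across a wide time range. A toy simulation can make the effect visible. This is a sketch under assumptions, not ClickHouse's actual index logic: the data, the granule size of 8 (the MergeTree default is 8192), and the function name `granules_touched` are all made up for illustration, and "touched" simply means the granule holds at least one matching row.

```python
from itertools import product


def granules_touched(rows, sort_key, matches, granule_size):
    """Count fixed-size granules that hold at least one matching row."""
    ordered = sorted(rows, key=sort_key)
    granules = [
        ordered[i:i + granule_size]
        for i in range(0, len(ordered), granule_size)
    ]
    return sum(1 for granule in granules if any(matches(r) for r in granule))


def is_click(row):
    return row[1] == "click"


# Toy data: 1000 timestamps x 4 event types = 4000 rows of (timestamp, eventType).
rows = list(product(range(1000), ["click", "show", "play", "share"]))

# ORDER BY (timestamp, eventType): clicks are scattered over the whole table.
by_time_first = granules_touched(rows, lambda r: (r[0], r[1]), is_click, 8)

# ORDER BY (eventType, timestamp): clicks form one contiguous run.
by_event_first = granules_touched(rows, lambda r: (r[1], r[0]), is_click, 8)

if __name__ == "__main__":
    print(by_time_first, by_event_first)
```

With this toy data every granule contains a click when timestamp leads the key, while only a quarter of the granules do when eventType leads, which matches the snippet's observation that the time index barely helps for cumulative metrics.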
2. ClickHouse in Practice at Tencent, 2019 - Ding Xiaokun & Xiong Feng

... GROUP BY key ORDER BY value DESC LIMIT 10 ... SELECT play_times_key AS key, sum(play_times_value) AS value FROM wegame ARRAY JOIN play_times_key, play_times_value GROUP BY key ORDER BY value DESC ...

0 points | 26 pages | 3.58 MB | 1 year ago
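2. ClickHouse MergeTree Internals Explained - Zhu Kai

... expr] [ORDER BY expr] [PRIMARY KEY expr] [SAMPLE BY expr] [SETTINGS name=value, ...] Partition key, sorting key, primary key ... index_granularity = 8192 (index granularity) ... MergeTree storage structure: data is organized into partitions (PARTITION BY); each column is stored independently and sorted by ORDER BY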

0 points | 35 pages | 13.25 MB | 1 year ago
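The MergeTree snippet mentions data sorted by ORDER BY and an index granularity setting. The general idea of a sparse index over sorted data can be sketched as follows. This is a simplified illustration, not MergeTree's implementation: it assumes a single integer key, stores the first key of every granule as a mark, and uses hypothetical names (`build_marks`, `granule_range`).

```python
from bisect import bisect_left, bisect_right


def build_marks(sorted_keys, granularity):
    """Keep only the first key of every granule (one mark per granule)."""
    return sorted_keys[::granularity]


def granule_range(marks, lo, hi):
    """Return (first, last) granule indexes that may hold keys in [lo, hi].

    The range is conservative: it can include one extra leading granule,
    because keys equal to `lo` may sit at the end of the granule before
    the first mark that is >= lo. A result with last < first means that
    no granule needs to be read.
    """
    first = max(bisect_left(marks, lo) - 1, 0)
    last = bisect_right(marks, hi) - 1
    return first, last


if __name__ == "__main__":
    keys = list(range(100))        # already sorted, like data ordered by the key
    marks = build_marks(keys, 10)  # toy granularity of 10 instead of 8192
    print(granule_range(marks, 25, 47))
```

Only the marks are searched, so the lookup cost depends on the number of granules rather than the number of rows; the price is that whole granules are read even when only a few rows match.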
What You Need to Know About ClickHouse Architecture to Use It Effectively

... count(*) AS count FROM hits WHERE CounterID = 1234 AND Date >= today() - 7 GROUP BY Referer ORDER BY count DESC LIMIT 10 ... A typical query in a web analytics system ... Reading fast › Only the needed columns: ...

0 points | 28 pages | 506.94 KB | 1 year ago
