abel_cjy / Amazon-Selection-Data
Commit 68c22cae, authored Mar 25, 2026 by wangjing
Merge branch 'developer' of http://47.106.101.75/abel_cjy/Amazon-Selection-Data into developer
Parents: 7ad606c5, b16c59eb
Showing 1 changed file with 8 additions and 10 deletions
Pyspark_job/export_es/es_asin_profit_rate.py
...
...
@@ -452,21 +452,19 @@ class EsAsinProfitRate(object):
         # 8. For the 30day index, additionally update the cate_flag fields (partial update; other fields are untouched)
         # Inner join with df_es so that only ASINs already present in the index are updated, avoiding "doc missing" errors
         # asin_source_flag: stored as a string in Hive, but the ES mapping is integer[], so it must be converted to an int array
         if base_date is None:
             print(f"[30day] Start updating cate_flag fields: asin_source_flag / bsr / nsr")
             df_cate_update = self.df_cate_flag.join(
                 df_es.select('asin'), on='asin', how='inner'
             ).select(
-                'asin', 'asin_source_flag',
-                'bsr_last_seen_at', 'bsr_seen_count_30d',
-                'nsr_last_seen_at', 'nsr_seen_count_30d'
-            ).na.fill({
-                'asin_source_flag': '0',
-                'bsr_last_seen_at': '1970-01-01',
-                'bsr_seen_count_30d': 0,
-                'nsr_last_seen_at': '1970-01-01',
-                'nsr_seen_count_30d': 0
-            })
+                'asin',
+                F.transform(F.split(F.col('asin_source_flag'), ','), lambda x: x.cast('int'))
+                    .alias('asin_source_flag'),
+                F.coalesce(F.col('bsr_last_seen_at'), F.lit('1970-01-01')).alias('bsr_last_seen_at'),
+                F.coalesce(F.col('bsr_seen_count_30d').cast('int'), F.lit(0)).alias('bsr_seen_count_30d'),
+                F.coalesce(F.col('nsr_last_seen_at'), F.lit('1970-01-01')).alias('nsr_last_seen_at'),
+                F.coalesce(F.col('nsr_seen_count_30d').cast('int'), F.lit(0)).alias('nsr_seen_count_30d')
+            )
             self.write_combined_update(df_cate_update, index_name)
         df_es.unpersist()
...
...
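To make the effect of the new select easier to follow, here is a rough plain-Python model of what it appears to compute for a single row. It is only a sketch, not the Spark job: `parse_source_flag`, `build_cate_update`, `EPOCH_DATE` and the sample row are hypothetical names and data introduced for illustration, and a failed integer cast is assumed to yield a null, as Spark does when ANSI mode is off. One visible difference from the removed `na.fill` version is that a null `asin_source_flag` is no longer defaulted to `'0'`; the diff adds no coalesce for that column, so it stays null.

```python
from typing import Optional

# Default the diff uses when a *_last_seen_at value is missing.
EPOCH_DATE = "1970-01-01"


def parse_source_flag(raw: Optional[str]) -> Optional[list]:
    """Model of transform(split(col, ','), x -> cast(x as int)) for one value."""
    if raw is None:
        # split() on a null column yields null; the diff adds no default here.
        return None
    flags = []
    for piece in raw.split(","):
        try:
            flags.append(int(piece))
        except ValueError:
            # Assumption: a failed cast becomes null (non-ANSI Spark behaviour).
            flags.append(None)
    return flags


def to_int_or_default(value, default: int = 0) -> int:
    """Model of coalesce(col.cast('int'), lit(default)) for one value."""
    if value is None:
        return default
    try:
        return int(value)
    except (TypeError, ValueError):
        return default


def build_cate_update(row: dict) -> dict:
    """Build the partial-update document for one row (hypothetical helper)."""
    def date_or_epoch(value):
        # coalesce() only replaces nulls, so empty strings pass through as-is.
        return EPOCH_DATE if value is None else value

    return {
        "asin": row["asin"],
        "asin_source_flag": parse_source_flag(row.get("asin_source_flag")),
        "bsr_last_seen_at": date_or_epoch(row.get("bsr_last_seen_at")),
        "bsr_seen_count_30d": to_int_or_default(row.get("bsr_seen_count_30d")),
        "nsr_last_seen_at": date_or_epoch(row.get("nsr_last_seen_at")),
        "nsr_seen_count_30d": to_int_or_default(row.get("nsr_seen_count_30d")),
    }


# Made-up sample row, for illustration only.
sample = {
    "asin": "B000TEST01",
    "asin_source_flag": "1,2",
    "bsr_last_seen_at": None,
    "bsr_seen_count_30d": "3",
    "nsr_last_seen_at": "2026-03-20",
    "nsr_seen_count_30d": None,
}
doc = build_cate_update(sample)
```

With the sample row above, the string `"1,2"` becomes a two-element integer list, which is the shape an `integer[]` mapping expects, while the missing date and count fall back to the same defaults the old `na.fill` call supplied.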