2025-07-14
|
07:33 |
<marostegui@cumin1002> |
dbctl commit (dc=all): 'Repooling after maintenance db2151 (T399249)', diff saved to https://phabricator.wikimedia.org/P78938 and previous config saved to /var/cache/conftool/dbconfig/20250714-073354-marostegui.json |
[production] |
07:21 |
<elukey@deploy1003> |
helmfile [codfw] DONE helmfile.d/services/kartotherian: sync |
[production] |
07:20 |
<elukey@deploy1003> |
helmfile [codfw] START helmfile.d/services/kartotherian: sync |
[production] |
07:19 |
<elukey@deploy1003> |
helmfile [codfw] DONE helmfile.d/services/kartotherian: sync |
[production] |
07:08 |
<elukey@deploy1003> |
helmfile [codfw] START helmfile.d/services/kartotherian: sync |
[production] |
06:54 |
<jmm@cumin2002> |
DONE (PASS) - Cookbook sre.idm.logout (exit_code=0) Logging Dale Zhou out of all services on: 2395 hosts |
[production] |
06:52 |
<marostegui@cumin1002> |
dbctl commit (dc=all): 'Depooling db2151 (T399249)', diff saved to https://phabricator.wikimedia.org/P78937 and previous config saved to /var/cache/conftool/dbconfig/20250714-065240-marostegui.json |
[production] |
06:52 |
<marostegui@cumin1002> |
DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2151.codfw.wmnet with reason: Maintenance |
[production] |
06:41 |
<marostegui@cumin1002> |
END (PASS) - Cookbook sre.mysql.depool (exit_code=0) db1200 - Depool db1200.eqiad.wmnet to then clone it to db1207.eqiad.wmnet - marostegui@cumin1002 |
[production] |
06:41 |
<marostegui@cumin1002> |
START - Cookbook sre.mysql.depool db1200 - Depool db1200.eqiad.wmnet to then clone it to db1207.eqiad.wmnet - marostegui@cumin1002 |
[production] |
06:41 |
<marostegui@cumin1002> |
START - Cookbook sre.mysql.clone of db1200.eqiad.wmnet onto db1207.eqiad.wmnet |
[production] |
06:28 |
<marostegui@cumin1002> |
DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1207.eqiad.wmnet with reason: Maintenance |
[production] |
06:23 |
<marostegui> |
Failover m1 from db1207 to db1213 - T399172 |
[production] |
06:15 |
<marostegui@cumin1002> |
DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db[2160,2232].codfw.wmnet,db[1207,1213,1217].eqiad.wmnet with reason: Primary switchover m1 T399172 |
[production] |
05:50 |
<amastilovic@deploy1003> |
helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/blunderbuss: apply |
[production] |
05:49 |
<amastilovic@deploy1003> |
helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/blunderbuss: apply |
[production] |
05:45 |
<amastilovic@deploy1003> |
helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/blunderbuss: apply |
[production] |
05:44 |
<amastilovic@deploy1003> |
helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/blunderbuss: apply |
[production] |
05:40 |
<amastilovic@deploy1003> |
helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/blunderbuss: apply |
[production] |
05:39 |
<amastilovic@deploy1003> |
helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/blunderbuss: apply |
[production] |
05:35 |
<amastilovic@deploy1003> |
helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/blunderbuss: apply |
[production] |
05:34 |
<amastilovic@deploy1003> |
helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/blunderbuss: apply |
[production] |
2025-07-13
|
21:20 |
<wmbot~multichill@tools-bastion-12> |
Unable to add jobs, created T399417 |
[tools.multichill] |
18:04 |
<andrew@cumin2002> |
END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudnet2006-dev.codfw.wmnet with OS bookworm |
[production] |
17:43 |
<andrew@cumin2002> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudnet2006-dev.codfw.wmnet with reason: host reimage |
[production] |
17:36 |
<andrew@cumin2002> |
START - Cookbook sre.hosts.downtime for 2:00:00 on cloudnet2006-dev.codfw.wmnet with reason: host reimage |
[production] |
17:17 |
<andrew@cumin2002> |
START - Cookbook sre.hosts.reimage for host cloudnet2006-dev.codfw.wmnet with OS bookworm |
[production] |
15:50 |
<wmbot~lucaswerkmeister@tools-bastion-13> |
deployed 3c977ccc7b (specify .python-version) |
[tools.lexeme-forms] |
15:45 |
<wmbot~soda@tools-bastion-13> |
soda built and uploaded a new version |
[tools.yapping-sodium] |
15:39 |
<wmbot~lucaswerkmeister@tools-bastion-13> |
deployed 168a4bf7cc (upgrade m3api) |
[tools.wd-image-positions] |
15:31 |
<wmbot~lucaswerkmeister@tools-bastion-13> |
deployed ebfaeef6e0 (specify .python-version) |
[tools.wd-image-positions] |
15:27 |
<wmbot~lucaswerkmeister@tools-bastion-13> |
deployed 81a627821b (Python 3.13 + Toolforge Build Service) |
[tools.wd-image-positions] |
15:27 |
<lucaswerkmeister> |
webservice stop && mv www{,-unused-tool-now-runs-on-buildservice} && wget https://gitlab.wikimedia.org/toolforge-repos/wd-image-positions/-/raw/81a627821b/service.template && webservice start |
[tools.wd-image-positions] |
14:48 |
<wmbot~lucaswerkmeister@tools-bastion-13> |
deployed b27c4c2d73 (read config from envvars) |
[tools.wd-image-positions] |
14:48 |
<wmbot~lucaswerkmeister@tools-bastion-13> |
commented out config.yaml, should use envvars instead |
[tools.wd-image-positions] |
14:47 |
<lucaswerkmeister> |
python3 -c 'import yaml; print(yaml.safe_dump(yaml.safe_load(open("config.yaml"))["SECRET_KEY"]))' | toolforge envvars create TOOL_SECRET_KEY |
[tools.wd-image-positions] |
14:46 |
<lucaswerkmeister> |
python3 -c 'import yaml; print(yaml.safe_dump(yaml.safe_load(open("config.yaml"))["OAUTH"]["CONSUMER_SECRET"]))' | toolforge envvars create TOOL_OAUTH__CONSUMER_SECRET |
[tools.wd-image-positions] |
14:46 |
<lucaswerkmeister> |
python3 -c 'import yaml; print(yaml.safe_dump(yaml.safe_load(open("config.yaml"))["OAUTH"]["CONSUMER_KEY"]))' | toolforge envvars create TOOL_OAUTH__CONSUMER_KEY |
[tools.wd-image-positions] |
14:46 |
<lucaswerkmeister> |
disregard the previous message, wrong tool 🤦 |
[tools.lexeme-forms] |
14:46 |
<lucaswerkmeister> |
python3 -c 'import yaml; print(yaml.safe_dump(yaml.safe_load(open("config.yaml"))["OAUTH"]["CONSUMER_KEY"]))' | toolforge envvars create TOOL_OAUTH__CONSUMER_KEY |
[tools.lexeme-forms] |
14:43 |
<wmbot~lucaswerkmeister@tools-bastion-13> |
deployed b0af29d932 (change config keys to uppercase to work around T374780) |
[tools.wd-image-positions] |
14:11 |
<andrew@cumin2002> |
END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudnet2006-dev.codfw.wmnet with OS bookworm |
[production] |
13:51 |
<wmbot~lucaswerkmeister@tools-bastion-13> |
deployed 451794996c (add health-check-path) |
[tools.wd-image-positions] |
13:47 |
<andrew@cumin2002> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudnet2006-dev.codfw.wmnet with reason: host reimage |
[production] |
13:45 |
<wmbot~lucaswerkmeister@tools-bastion-13> |
deployed 99489081d3 (split dev requirements from prod requirements) |
[tools.wd-image-positions] |
13:42 |
<andrew@cumin2002> |
START - Cookbook sre.hosts.downtime for 2:00:00 on cloudnet2006-dev.codfw.wmnet with reason: host reimage |
[production] |
13:24 |
<andrew@cumin2002> |
START - Cookbook sre.hosts.reimage for host cloudnet2006-dev.codfw.wmnet with OS bookworm |
[production] |
13:23 |
<andrew@cumin2002> |
END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cloudnet2006-dev.codfw.wmnet with OS bullseye |
[production] |
13:14 |
<andrew@cumin2002> |
START - Cookbook sre.hosts.reimage for host cloudnet2006-dev.codfw.wmnet with OS bullseye |
[production] |
12:02 |
<wmbot~lucaswerkmeister@tools-bastion-13> |
deployed d9d2273efb (upgrade dependencies) |
[tools.wd-image-positions] |