2023-04-24
|
08:32 |
<marostegui@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on 26 hosts with reason: Enabling replication T335266 |
[production] |
08:32 |
<marostegui@cumin1001> |
START - Cookbook sre.hosts.downtime for 0:15:00 on 26 hosts with reason: Enabling replication T335266 |
[production] |
08:29 |
<marostegui@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on 27 hosts with reason: Enabling replication T335266 |
[production] |
08:28 |
<marostegui@cumin1001> |
START - Cookbook sre.hosts.downtime for 0:15:00 on 27 hosts with reason: Enabling replication T335266 |
[production] |
08:28 |
<marostegui> |
Enable replication eqiad -> codfw on s6 dbmaint eqiad T335266 |
[production] |
08:27 |
<marostegui@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on 27 hosts with reason: Enabling replication T335266 |
[production] |
08:26 |
<marostegui@cumin1001> |
START - Cookbook sre.hosts.downtime for 0:15:00 on 27 hosts with reason: Enabling replication T335266 |
[production] |
08:26 |
<marostegui> |
Enable replication eqiad -> codfw on s2 dbmaint eqiad T335266 |
[production] |
08:25 |
<btullis@cumin1001> |
START - Cookbook sre.hosts.dhcp for host an-worker1110.eqiad.wmnet |
[production] |
08:21 |
<btullis@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on an-worker1110.eqiad.wmnet with reason: Upgrading RAID controller firmware |
[production] |
08:21 |
<btullis@cumin1001> |
START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on an-worker1110.eqiad.wmnet with reason: Upgrading RAID controller firmware |
[production] |
08:20 |
<marostegui@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on 10 hosts with reason: Enabling replication T335266 |
[production] |
08:20 |
<marostegui@cumin1001> |
START - Cookbook sre.hosts.downtime for 0:15:00 on 10 hosts with reason: Enabling replication T335266 |
[production] |
08:20 |
<marostegui> |
Enable replication eqiad -> codfw on x1 dbmaint eqiad T335266 |
[production] |
08:18 |
<cgoubert@deploy2002> |
Started scap: testing T329857 |
[production] |
08:17 |
<marostegui> |
Enable replication eqiad -> codfw on es5 dbmaint eqiad T335266 |
[production] |
08:14 |
<claime> |
Deploying 909302 on deploy2002 for T329857 |
[production] |
08:10 |
<claime> |
Disabling puppet on deploy2002 - T329857 |
[production] |
08:09 |
<claime> |
Deploying 909302 on deploy1002 for T329857 |
[production] |
08:08 |
<marostegui@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on 6 hosts with reason: Enabling replication T335266 |
[production] |
08:08 |
<marostegui> |
Enable replication eqiad -> codfw on es4 dbmaint eqiad T335266 |
[production] |
08:08 |
<marostegui@cumin1001> |
START - Cookbook sre.hosts.downtime for 0:15:00 on 6 hosts with reason: Enabling replication T335266 |
[production] |
08:07 |
<marostegui> |
Enable replication eqiad -> codfw on pc3 dbmaint eqiad T335266 |
[production] |
08:06 |
<marostegui> |
Enable replication eqiad -> codfw on pc2 dbmaint eqiad T335266 |
[production] |
08:05 |
<marostegui> |
Enable replication eqiad -> codfw on pc1 dbmaint eqiad T335266 |
[production] |
07:53 |
<mvernon@cumin2002> |
END (PASS) - Cookbook sre.swift.remove-ghost-objects (exit_code=0) from container wikipedia-commons-local-public.41 in codfw |
[production] |
07:51 |
<mvernon@cumin2002> |
START - Cookbook sre.swift.remove-ghost-objects from container wikipedia-commons-local-public.41 in codfw |
[production] |
07:45 |
<jelto@cumin2002> |
END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host gitlab1004.wikimedia.org with OS bullseye |
[production] |
07:44 |
<mvernon@cumin2002> |
END (PASS) - Cookbook sre.swift.remove-ghost-objects (exit_code=0) from container wikipedia-commons-local-public.59 in codfw |
[production] |
07:42 |
<mvernon@cumin2002> |
START - Cookbook sre.swift.remove-ghost-objects from container wikipedia-commons-local-public.59 in codfw |
[production] |
07:39 |
<dcausse> |
restarting blazegraph on wdqs1005 (stuck for 3+ days) |
[production] |
07:38 |
<mvernon@cumin2002> |
END (PASS) - Cookbook sre.swift.remove-ghost-objects (exit_code=0) from container wikipedia-commons-local-public.4a in codfw |
[production] |
07:36 |
<mvernon@cumin2002> |
START - Cookbook sre.swift.remove-ghost-objects from container wikipedia-commons-local-public.4a in codfw |
[production] |
07:24 |
<jelto@cumin2002> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on gitlab1004.wikimedia.org with reason: host reimage |
[production] |
07:21 |
<jelto@cumin2002> |
START - Cookbook sre.hosts.downtime for 2:00:00 on gitlab1004.wikimedia.org with reason: host reimage |
[production] |
07:06 |
<jelto@cumin2002> |
START - Cookbook sre.hosts.reimage for host gitlab1004.wikimedia.org with OS bullseye |
[production] |
2023-04-22
|
21:50 |
<hashar> |
Reloaded Zuul for https://gerrit.wikimedia.org/r/c/integration/config/+/908941 |
[releng] |
21:47 |
<hashar> |
Reloaded Zuul for https://gerrit.wikimedia.org/r/c/integration/config/+/909624 |
[releng] |
19:35 |
<Krinkle> |
Create excimer_ui_server db user in Beta Cluster on deployment-db13 based on prod grants. Password stored in deployment-puppetmaster04:/var/lib/git/labs/private/ (passwords::excimer_ui_server::$excimer_db_pass). Picked db13 because recommendationapi is here (also prod m2), and because mdb (originally created for this purpose) appears to have been broken for several OS iterations (likely forgotten due to its unusual name) T301637, T331956 |
[releng] |
18:05 |
<Krinkle> |
Fix database hostname DNS error at https://performance.wikimedia.beta.wmflabs.org/xhgui/, switch from mdb01 to mdb02, ref T301637 |
[releng] |
18:00 |
<Krinkle> |
Move deployment-xhgui03 config to the deployment-xhgui prefix, https://gerrit.wikimedia.org/g/cloud/instance-puppet/+/1d03d4cf84a148bff7e055d9939c44f30c618d85/deployment-prep/deployment-xhgui03.deployment-prep.eqiad1.wikimedia.cloud.yaml https://gerrit.wikimedia.org/r/plugins/gitiles/cloud/instance-puppet/+/ab1c9d2968ba3c292b81ba76f8c0775151630f94%5E%21/, ref T301637 |
[releng] |
16:18 |
<wm-bot> |
<lucaswerkmeister> deployed a074fd9c64 (trim spaces) |
[tools.lexeme-forms] |
15:39 |
<wm-bot> |
<lucaswerkmeister> deployed fdb0552957 (remove spaces) |
[tools.lexeme-forms] |
15:23 |
<wm-bot> |
<lucaswerkmeister> Double IRC messages to other bridges |
[tools.bridgebot] |
15:21 |
<wm-bot> |
<lucaswerkmeister> deployed b6a1268b21 (Punjabi nouns) |
[tools.lexeme-forms] |