2024-07-16
19:18 <cdanis@deploy1002> helmfile [eqiad] START helmfile.d/admin 'apply'. [production]
19:17 <cdanis@deploy1002> helmfile [codfw] DONE helmfile.d/admin 'apply'. [production]
19:15 <cdanis@deploy1002> helmfile [codfw] START helmfile.d/admin 'apply'. [production]
19:07 <pt1979@cumin2002> START - Cookbook sre.hosts.reimage for host dbproxy2008.codfw.wmnet with OS bookworm [production]
19:05 <arnaudb@cumin1002> dbctl commit (dc=all): 'Depooling db2177 (T367781)', diff saved to https://phabricator.wikimedia.org/P66680 and previous config saved to /var/cache/conftool/dbconfig/20240716-190526-arnaudb.json [production]
19:05 <arnaudb@cumin1002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2177.codfw.wmnet with reason: Maintenance [production]
19:05 <arnaudb@cumin1002> START - Cookbook sre.hosts.downtime for 4:00:00 on db2177.codfw.wmnet with reason: Maintenance [production]
19:05 <arnaudb@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db2156 (T367781)', diff saved to https://phabricator.wikimedia.org/P66679 and previous config saved to /var/cache/conftool/dbconfig/20240716-190504-arnaudb.json [production]
18:56 <marostegui@cumin1002> dbctl commit (dc=all): 'Depooling db2140 (T367856)', diff saved to https://phabricator.wikimedia.org/P66678 and previous config saved to /var/cache/conftool/dbconfig/20240716-185657-marostegui.json [production]
18:56 <marostegui@cumin1002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2140.codfw.wmnet with reason: Maintenance [production]
18:56 <marostegui@cumin1002> START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2140.codfw.wmnet with reason: Maintenance [production]
18:51 <cdanis@deploy1002> helmfile [eqiad] DONE helmfile.d/admin 'apply'. [production]
18:50 <cdanis@deploy1002> helmfile [eqiad] START helmfile.d/admin 'apply'. [production]
18:49 <arnaudb@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db2156', diff saved to https://phabricator.wikimedia.org/P66677 and previous config saved to /var/cache/conftool/dbconfig/20240716-184956-arnaudb.json [production]
18:49 <cdanis@deploy1002> helmfile [codfw] DONE helmfile.d/admin 'apply'. [production]
18:49 <cdanis@deploy1002> helmfile [codfw] START helmfile.d/admin 'apply'. [production]
18:45 <pt1979@cumin2002> END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host dbproxy2007.codfw.wmnet with OS bookworm [production]
18:34 <arnaudb@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db2156', diff saved to https://phabricator.wikimedia.org/P66675 and previous config saved to /var/cache/conftool/dbconfig/20240716-183449-arnaudb.json [production]
18:27 <pt1979@cumin2002> START - Cookbook sre.hosts.reimage for host dbproxy2007.codfw.wmnet with OS bookworm [production]
18:19 <arnaudb@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db2156 (T367781)', diff saved to https://phabricator.wikimedia.org/P66674 and previous config saved to /var/cache/conftool/dbconfig/20240716-181942-arnaudb.json [production]
18:14 <dancy@deploy1002> rebuilt and synchronized wikiversions files: group0 wikis to 1.43.0-wmf.14 refs T366959 [production]
18:00 <dancy@deploy1002> Installing scap version "4.92.0" for 232 hosts [production]
17:59 <otto@deploy1002> Finished deploy [analytics/refinery@f97900c]: Deploy refinery with refinery-source version 0.2.44 for mw on k8s - take 3 [analytics/refinery@f97900c9] (duration: 00m 47s) [production]
17:58 <otto@deploy1002> Started deploy [analytics/refinery@f97900c]: Deploy refinery with refinery-source version 0.2.44 for mw on k8s - take 3 [analytics/refinery@f97900c9] [production]
17:58 <otto@deploy1002> Finished deploy [analytics/refinery@f97900c]: Deploy refinery with refinery-source version 0.2.44 for mw on k8s - take 2 [analytics/refinery@f97900c9] (duration: 02m 44s) [production]
17:58 <arnaudb@cumin1002> dbctl commit (dc=all): 'Depooling db2156 (T367781)', diff saved to https://phabricator.wikimedia.org/P66672 and previous config saved to /var/cache/conftool/dbconfig/20240716-175820-arnaudb.json [production]
17:58 <arnaudb@cumin1002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2186.codfw.wmnet with reason: Maintenance [production]
17:58 <arnaudb@cumin1002> START - Cookbook sre.hosts.downtime for 8:00:00 on db2186.codfw.wmnet with reason: Maintenance [production]
17:57 <arnaudb@cumin1002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2156.codfw.wmnet with reason: Maintenance [production]
17:57 <arnaudb@cumin1002> START - Cookbook sre.hosts.downtime for 4:00:00 on db2156.codfw.wmnet with reason: Maintenance [production]
17:57 <arnaudb@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db2149 (T367781)', diff saved to https://phabricator.wikimedia.org/P66671 and previous config saved to /var/cache/conftool/dbconfig/20240716-175742-arnaudb.json [production]
17:55 <otto@deploy1002> Started deploy [analytics/refinery@f97900c]: Deploy refinery with refinery-source version 0.2.44 for mw on k8s - take 2 [analytics/refinery@f97900c9] [production]
17:55 <otto@deploy1002> Finished deploy [analytics/refinery@f97900c]: Deploy refinery with refinery-source version 0.2.44 for mw on k8s [analytics/refinery@f97900c9] (duration: 08m 33s) [production]
17:55 <cdanis@deploy1002> helmfile [codfw] DONE helmfile.d/admin 'apply'. [production]
17:53 <cdanis@deploy1002> helmfile [codfw] START helmfile.d/admin 'apply'. [production]
17:47 <otto@deploy1002> Started deploy [analytics/refinery@f97900c]: Deploy refinery with refinery-source version 0.2.44 for mw on k8s [analytics/refinery@f97900c9] [production]
17:47 <otto@deploy1002> Finished deploy [analytics/refinery@f97900c] (hadoop-test): Deploy refinery with refinery-source version 0.2.44 for mw on k8s - TEST [analytics/refinery@f97900c9] (duration: 03m 23s) [production]
17:46 <swfrench-wmf> appservers-rw and api-rw now resolve to failoid - T367949 [production]
17:44 <otto@deploy1002> Started deploy [analytics/refinery@f97900c] (hadoop-test): Deploy refinery with refinery-source version 0.2.44 for mw on k8s - TEST [analytics/refinery@f97900c9] [production]
17:44 <swfrench@cumin2002> conftool action : set/pooled=false; selector: dnsdisc=api-rw,name=eqiad [reason: Depooling ahead of turndown - T367949] [production]
17:43 <swfrench@cumin2002> conftool action : set/pooled=false; selector: dnsdisc=appservers-rw,name=eqiad [reason: Depooling ahead of turndown - T367949] [production]
17:42 <arnaudb@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db2149', diff saved to https://phabricator.wikimedia.org/P66670 and previous config saved to /var/cache/conftool/dbconfig/20240716-174235-arnaudb.json [production]
17:40 <swfrench@cumin2002> conftool action : set/pooled=false; selector: dnsdisc=api-ro,name=codfw [reason: Depooling ahead of turndown - T367949] [production]
17:39 <swfrench@cumin2002> conftool action : set/pooled=false; selector: dnsdisc=appservers-ro,name=codfw [reason: Depooling ahead of turndown - T367949] [production]
17:27 <arnaudb@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db2149', diff saved to https://phabricator.wikimedia.org/P66669 and previous config saved to /var/cache/conftool/dbconfig/20240716-172727-arnaudb.json [production]
17:14 <pt1979@cumin2002> END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dbproxy2006.codfw.wmnet with OS bookworm [production]
17:14 <pt1979@cumin2002> END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - pt1979@cumin2002" [production]
17:12 <pt1979@cumin2002> START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - pt1979@cumin2002" [production]
17:12 <arnaudb@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db2149 (T367781)', diff saved to https://phabricator.wikimedia.org/P66668 and previous config saved to /var/cache/conftool/dbconfig/20240716-171220-arnaudb.json [production]
17:00 <mutante> lists2001 - systemctl reset-failed after gerrit:1054610 to fix T370098 [production]