2022-08-25
20:17 <urbanecm> [urbanecm@deploy1002 ~]$ rm /var/lock/scap.operations_mediawiki-config.lock # connection to deploy1002 handled, to let me re-sync [production]
20:14 <urandom> re-rebooting ms-be2067 to "fix" disk enumeration(?) -- T314049 [production]
20:14 <mwdebug-deploy@deploy1002> helmfile [codfw] DONE helmfile.d/services/mwdebug: apply [production]
20:13 <mwdebug-deploy@deploy1002> helmfile [codfw] START helmfile.d/services/mwdebug: apply [production]
20:13 <mwdebug-deploy@deploy1002> helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply [production]
20:12 <mwdebug-deploy@deploy1002> helmfile [eqiad] START helmfile.d/services/mwdebug: apply [production]
20:11 <bking@cumin2002> END (FAIL) - Cookbook sre.elasticsearch.rolling-operation (exit_code=99) Operation.UPGRADE (1 nodes at a time) for ElasticSearch cluster cloudelastic: cloudelastic elasticsearch and plugin upgrade - bking@cumin2002 - T316159 [production]
20:11 <ladsgroup@cumin1001> dbctl commit (dc=all): 'Repooling after maintenance db1165', diff saved to https://phabricator.wikimedia.org/P33154 and previous config saved to /var/cache/conftool/dbconfig/20220825-201141-ladsgroup.json [production]
20:07 <bking@cumin2002> START - Cookbook sre.elasticsearch.rolling-operation Operation.UPGRADE (1 nodes at a time) for ElasticSearch cluster cloudelastic: cloudelastic elasticsearch and plugin upgrade - bking@cumin2002 - T316159 [production]
20:02 <ladsgroup@cumin1001> dbctl commit (dc=all): 'Repooling after maintenance db2115', diff saved to https://phabricator.wikimedia.org/P33153 and previous config saved to /var/cache/conftool/dbconfig/20220825-200250-ladsgroup.json [production]
19:56 <ladsgroup@cumin1001> dbctl commit (dc=all): 'Repooling after maintenance db1165', diff saved to https://phabricator.wikimedia.org/P33152 and previous config saved to /var/cache/conftool/dbconfig/20220825-195635-ladsgroup.json [production]
19:47 <ladsgroup@cumin1001> dbctl commit (dc=all): 'Repooling after maintenance db2115', diff saved to https://phabricator.wikimedia.org/P33151 and previous config saved to /var/cache/conftool/dbconfig/20220825-194744-ladsgroup.json [production]
19:42 <bking@cumin2002> END (FAIL) - Cookbook sre.elasticsearch.rolling-operation (exit_code=99) Operation.UPGRADE (1 nodes at a time) for ElasticSearch cluster cloudelastic: cloudelastic elasticsearch and plugin upgrade - bking@cumin2002 - T316159 [production]
19:41 <ladsgroup@cumin1001> dbctl commit (dc=all): 'Repooling after maintenance db1165 (T316186)', diff saved to https://phabricator.wikimedia.org/P33150 and previous config saved to /var/cache/conftool/dbconfig/20220825-194129-ladsgroup.json [production]
19:41 <bking@cumin2002> START - Cookbook sre.elasticsearch.rolling-operation Operation.UPGRADE (1 nodes at a time) for ElasticSearch cluster cloudelastic: cloudelastic elasticsearch and plugin upgrade - bking@cumin2002 - T316159 [production]
19:37 <andrew@cumin1001> END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) for hosts cloudservices1003 [production]
19:37 <andrew@cumin1001> END (PASS) - Cookbook sre.dns.netbox (exit_code=0) [production]
19:36 <urandom> rebooting ms-be2067 to "fix" disk enumeration(?) -- T314049 [production]
19:35 <ladsgroup@cumin1001> dbctl commit (dc=all): 'Depooling db1165 (T316186)', diff saved to https://phabricator.wikimedia.org/P33149 and previous config saved to /var/cache/conftool/dbconfig/20220825-193513-ladsgroup.json [production]
19:35 <ladsgroup@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance [production]
19:34 <ladsgroup@cumin1001> START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance [production]
19:34 <ladsgroup@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1165.eqiad.wmnet with reason: Maintenance [production]
19:34 <ladsgroup@cumin1001> START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1165.eqiad.wmnet with reason: Maintenance [production]
19:34 <ladsgroup@cumin1001> dbctl commit (dc=all): 'Repooling after maintenance db1168 (T316186)', diff saved to https://phabricator.wikimedia.org/P33148 and previous config saved to /var/cache/conftool/dbconfig/20220825-193430-ladsgroup.json [production]
19:33 <andrew@cumin1001> START - Cookbook sre.dns.netbox [production]
19:32 <ladsgroup@cumin1001> dbctl commit (dc=all): 'Repooling after maintenance db2115 (T312160)', diff saved to https://phabricator.wikimedia.org/P33147 and previous config saved to /var/cache/conftool/dbconfig/20220825-193238-ladsgroup.json [production]
19:29 <andrew@cumin1001> START - Cookbook sre.hosts.decommission for hosts cloudservices1003 [production]
19:19 <ladsgroup@cumin1001> dbctl commit (dc=all): 'Repooling after maintenance db1168', diff saved to https://phabricator.wikimedia.org/P33146 and previous config saved to /var/cache/conftool/dbconfig/20220825-191924-ladsgroup.json [production]
19:04 <ladsgroup@cumin1001> dbctl commit (dc=all): 'Repooling after maintenance db1168', diff saved to https://phabricator.wikimedia.org/P33145 and previous config saved to /var/cache/conftool/dbconfig/20220825-190417-ladsgroup.json [production]
18:49 <ladsgroup@cumin1001> dbctl commit (dc=all): 'Repooling after maintenance db1168 (T316186)', diff saved to https://phabricator.wikimedia.org/P33144 and previous config saved to /var/cache/conftool/dbconfig/20220825-184911-ladsgroup.json [production]
18:48 <bking@cumin2002> END (FAIL) - Cookbook sre.elasticsearch.rolling-operation (exit_code=99) Operation.UPGRADE (1 nodes at a time) for ElasticSearch cluster cloudelastic: cloudelastic elasticsearch and plugin upgrade - bking@cumin2002 - T316159 [production]
18:48 <ebernhardson@deploy1002> Finished deploy [wikimedia/discovery/analytics@d00af45]: bump elasticsearch-hadoop to 7.10.2 (duration: 02m 07s) [production]
18:47 <bking@cumin2002> START - Cookbook sre.elasticsearch.rolling-operation Operation.UPGRADE (1 nodes at a time) for ElasticSearch cluster cloudelastic: cloudelastic elasticsearch and plugin upgrade - bking@cumin2002 - T316159 [production]
18:45 <ebernhardson@deploy1002> Started deploy [wikimedia/discovery/analytics@d00af45]: bump elasticsearch-hadoop to 7.10.2 [production]
18:43 <ladsgroup@cumin1001> dbctl commit (dc=all): 'Depooling db1168 (T316186)', diff saved to https://phabricator.wikimedia.org/P33143 and previous config saved to /var/cache/conftool/dbconfig/20220825-184301-ladsgroup.json [production]
18:42 <ladsgroup@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1168.eqiad.wmnet with reason: Maintenance [production]
18:42 <ladsgroup@cumin1001> START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1168.eqiad.wmnet with reason: Maintenance [production]
18:42 <ladsgroup@cumin1001> dbctl commit (dc=all): 'Repooling after maintenance db1131 (T316186)', diff saved to https://phabricator.wikimedia.org/P33142 and previous config saved to /var/cache/conftool/dbconfig/20220825-184233-ladsgroup.json [production]
18:36 <otto@deploy1002> helmfile [eqiad] DONE helmfile.d/services/eventgate-analytics-external: sync [production]
18:36 <otto@deploy1002> helmfile [eqiad] START helmfile.d/services/eventgate-analytics-external: sync [production]
18:35 <otto@deploy1002> helmfile [codfw] DONE helmfile.d/services/eventgate-analytics-external: sync [production]
18:34 <otto@deploy1002> helmfile [codfw] START helmfile.d/services/eventgate-analytics-external: sync [production]
18:34 <otto@deploy1002> helmfile [staging] DONE helmfile.d/services/eventgate-analytics-external: sync [production]
18:33 <otto@deploy1002> helmfile [staging] START helmfile.d/services/eventgate-analytics-external: sync [production]
18:33 <ottomata> rolling restart of eventgate-analytics-external to pick up retroactive schema change for android schemas in T316047 [production]
18:27 <ladsgroup@cumin1001> dbctl commit (dc=all): 'Repooling after maintenance db1131', diff saved to https://phabricator.wikimedia.org/P33141 and previous config saved to /var/cache/conftool/dbconfig/20220825-182727-ladsgroup.json [production]
18:19 <dancy@deploy1002> rebuilt and synchronized wikiversions files: (no justification provided) [production]
18:18 <bmansurov@deploy1002> Finished deploy [airflow-dags/research@5712187]: (no justification provided) (duration: 00m 09s) [production]
18:18 <bmansurov@deploy1002> Started deploy [airflow-dags/research@5712187]: (no justification provided) [production]
18:13 <dancy@deploy1002> Installation of scap version "4.15.0" completed for 557 hosts [production]