2022-09-15
06:47 <marostegui@cumin1001> dbctl commit (dc=all): 'Give some weight to db2096 T317842', diff saved to https://phabricator.wikimedia.org/P34747 and previous config saved to /var/cache/conftool/dbconfig/20220915-064750-marostegui.json [production]
06:46 <marostegui@cumin1001> dbctl commit (dc=all): 'Depool db2115 T317842', diff saved to https://phabricator.wikimedia.org/P34746 and previous config saved to /var/cache/conftool/dbconfig/20220915-064635-marostegui.json [production]
06:45 <marostegui@cumin1001> dbctl commit (dc=all): 'Promote db2096 to x1 primary and set section read-write T317842', diff saved to https://phabricator.wikimedia.org/P34745 and previous config saved to /var/cache/conftool/dbconfig/20220915-064525-root.json [production]
06:44 <marostegui> Starting x1 codfw failover from db2115 to db2096 - T317842 [production]
06:40 <marostegui@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 10 hosts with reason: Primary switchover x1 T317842 [production]
06:40 <marostegui@cumin1001> dbctl commit (dc=all): 'Set db2096 with weight 0 T317842', diff saved to https://phabricator.wikimedia.org/P34744 and previous config saved to /var/cache/conftool/dbconfig/20220915-064014-root.json [production]
06:40 <marostegui@cumin1001> START - Cookbook sre.hosts.downtime for 1:00:00 on 10 hosts with reason: Primary switchover x1 T317842 [production]
06:35 <marostegui@cumin1001> dbctl commit (dc=all): 'db2105 (re)pooling @ 1%: Repooling for warm up after upgrade', diff saved to https://phabricator.wikimedia.org/P34743 and previous config saved to /var/cache/conftool/dbconfig/20220915-063538-root.json [production]
06:14 <marostegui@cumin1001> dbctl commit (dc=all): 'Depool db2105 T317839', diff saved to https://phabricator.wikimedia.org/P34742 and previous config saved to /var/cache/conftool/dbconfig/20220915-061421-root.json [production]
06:13 <marostegui@cumin1001> dbctl commit (dc=all): 'Promote db2127 to s3 codfw T317839', diff saved to https://phabricator.wikimedia.org/P34741 and previous config saved to /var/cache/conftool/dbconfig/20220915-061317-marostegui.json [production]
06:12 <marostegui> Starting s3 codfw failover from db2105 to db2127 - T317839 [production]
06:03 <marostegui@cumin1001> dbctl commit (dc=all): 'Set db2127 with weight 0 T317839', diff saved to https://phabricator.wikimedia.org/P34740 and previous config saved to /var/cache/conftool/dbconfig/20220915-060307-root.json [production]
06:02 <marostegui@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 23 hosts with reason: Codfw switchover s3 T317839 [production]
06:02 <marostegui@cumin1001> START - Cookbook sre.hosts.downtime for 1:00:00 on 23 hosts with reason: Codfw switchover s3 T317839 [production]
05:32 <marostegui@cumin1001> END (ERROR) - Cookbook sre.hosts.downtime (exit_code=97) for 7 days, 0:00:00 on db1189.eqiad.wmnet with reason: down T317662 [production]
05:32 <marostegui@cumin1001> START - Cookbook sre.hosts.downtime for 7 days, 0:00:00 on db1189.eqiad.wmnet with reason: down T317662 [production]
05:12 <marostegui@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on db1189.eqiad.wmnet with reason: down T317662 [production]
05:12 <marostegui@cumin1001> START - Cookbook sre.hosts.downtime for 7 days, 0:00:00 on db1189.eqiad.wmnet with reason: down T317662 [production]
2022-09-14
22:08 <ladsgroup@cumin1001> dbctl commit (dc=all): 'Depooling db1190 (T314041)', diff saved to https://phabricator.wikimedia.org/P34739 and previous config saved to /var/cache/conftool/dbconfig/20220914-220822-ladsgroup.json [production]
22:08 <ladsgroup@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1190.eqiad.wmnet with reason: Maintenance [production]
22:08 <ladsgroup@cumin1001> START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1190.eqiad.wmnet with reason: Maintenance [production]
22:08 <ladsgroup@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1150.eqiad.wmnet with reason: Maintenance [production]
22:07 <ladsgroup@cumin1001> START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1150.eqiad.wmnet with reason: Maintenance [production]
22:07 <ladsgroup@cumin1001> dbctl commit (dc=all): 'Repooling after maintenance db1149 (T314041)', diff saved to https://phabricator.wikimedia.org/P34738 and previous config saved to /var/cache/conftool/dbconfig/20220914-220744-ladsgroup.json [production]
21:52 <ladsgroup@cumin1001> dbctl commit (dc=all): 'Repooling after maintenance db1149', diff saved to https://phabricator.wikimedia.org/P34737 and previous config saved to /var/cache/conftool/dbconfig/20220914-215238-ladsgroup.json [production]
21:38 <dduvall@deploy1002> Finished deploy [phabricator/deployment@3137c92]: testing phabricator deployment to phab2002 (duration: 01m 48s) [production]
21:37 <ladsgroup@cumin1001> dbctl commit (dc=all): 'Repooling after maintenance db1149', diff saved to https://phabricator.wikimedia.org/P34736 and previous config saved to /var/cache/conftool/dbconfig/20220914-213732-ladsgroup.json [production]
21:37 <dduvall@deploy1002> Started deploy [phabricator/deployment@3137c92]: testing phabricator deployment to phab2002 [production]
21:36 <dduvall> testing phabricator deployment to phab2002. should have no production impact (not serving traffic, no access to r/w db) [production]
21:35 <dduvall@deploy1002> Installation of scap version "4.19.1" completed for 561 hosts [production]
21:35 <dduvall@deploy1002> Installing scap version "4.19.1" for 561 hosts [production]
21:34 <dduvall> Deploying scap 4.19.1 (https://gerrit.wikimedia.org/r/c/mediawiki/tools/scap/+/832297/1/changelog) [production]
21:22 <ladsgroup@cumin1001> dbctl commit (dc=all): 'Repooling after maintenance db1149 (T314041)', diff saved to https://phabricator.wikimedia.org/P34735 and previous config saved to /var/cache/conftool/dbconfig/20220914-212225-ladsgroup.json [production]
20:47 <mwdebug-deploy@deploy1002> helmfile [codfw] DONE helmfile.d/services/mwdebug: apply [production]
20:47 <mwdebug-deploy@deploy1002> helmfile [codfw] START helmfile.d/services/mwdebug: apply [production]
20:47 <mwdebug-deploy@deploy1002> helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply [production]
20:47 <mwdebug-deploy@deploy1002> helmfile [eqiad] START helmfile.d/services/mwdebug: apply [production]
20:44 <dancy@deploy1002> Sync cancelled. [production]
20:44 <dancy@deploy1002> dancy: testing synced to the testservers: mwdebug1002.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug1001.eqiad.wmnet, mwdebug2001.codfw.wmnet [production]
20:44 <dancy@deploy1002> helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply [production]
20:44 <dancy@deploy1002> helmfile [codfw] DONE helmfile.d/services/mwdebug: apply [production]
20:40 <dancy@deploy1002> helmfile [eqiad] START helmfile.d/services/mwdebug: apply [production]
20:40 <dancy@deploy1002> helmfile [codfw] START helmfile.d/services/mwdebug: apply [production]
20:39 <dancy@deploy1002> Started scap: testing [production]
20:38 <dancy@deploy1002> Synchronized php: group1 wikis to 1.40.0-wmf.1 refs T314190 (duration: 05m 49s) [production]
20:34 <dancy@deploy1002> helmfile [codfw] DONE helmfile.d/services/mwdebug: apply [production]
20:34 <dancy@deploy1002> helmfile [codfw] START helmfile.d/services/mwdebug: apply [production]
20:34 <dancy@deploy1002> helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply [production]
20:33 <dancy@deploy1002> helmfile [eqiad] START helmfile.d/services/mwdebug: apply [production]
20:32 <dancy@deploy1002> rebuilt and synchronized wikiversions files: group1 wikis to 1.40.0-wmf.1 refs T314190 [production]