2021-05-27
15:56 |
<ryankemper> |
T280382 `sudo -i cookbook sre.wdqs.data-transfer --source wdqs2008.codfw.wmnet --dest wdqs2004.codfw.wmnet --reason "transferring fresh wikidata journal following runaway inflation of wdqs2004's wikidata.jnl" --blazegraph_instance blazegraph` on `ryankemper@cumin2002` tmux session `wdqs_disk` |
[production] |
15:56 |
<ryankemper@cumin2002> |
START - Cookbook sre.wdqs.data-transfer |
[production] |
15:53 |
<cmjohnson@cumin1001> |
END (PASS) - Cookbook sre.dns.netbox (exit_code=0) |
[production] |
15:50 |
<ryankemper> |
T280382 (fixing a couple of wrong host names in the last log line) `wdqs2004` inexplicably has a 2.5TB `wikidata.jnl`. By comparison, `wdqs1006` has a 1.6T `wikidata.jnl`, and `wdqs2001`, `wdqs2002`, and `wdqs2008` have a 975G `wikidata.jnl` |
[production] |
15:49 |
<cmjohnson@cumin1001> |
START - Cookbook sre.dns.netbox |
[production] |
15:44 |
<ryankemper> |
T280382 `wdqs2004` inexplicably has a 2.5TB `wikidata.jnl`. By comparison `wdqs1006` has a 1.6T `wikidata.jnl`, and `wdqs2004` and `wdqs2001` have a 975G `wikidata.jnl`. It's not clear why there's such a big divergence |
[production] |
15:41 |
<ryankemper> |
T280382 `wdqs2004` inexplicably has a 2.5TB `wikidata.jnl`. By comparison `wdqs1006` has a 1.6T `wikidata.jnl` |
[production] |
15:12 |
<XioNoX> |
test netconf over ssh on cr3-ulsfo |
[production] |
15:03 |
<effie> |
disable puppet mc2019 |
[production] |
14:14 |
<moritzm> |
bounce keyholder-agent on cumin2001 to drop homer key (now on 2002 only) |
[production] |
12:57 |
<tgr> |
T283606: running mwscript extensions/GrowthExperiments/maintenance/fixLinkRecommendationData.php --wiki={ar,bn,cs,vi}wiki --verbose --search-index with gerrit:696307 applied |
[production] |
12:55 |
<tgr> |
T283606: running mwscript extensions/GrowthExperiments/maintenance/fixLinkRecommendationData.php --wiki={ar,bn,cs,vi}wiki --verbose --search-index |
[production] |
12:50 |
<kormat@deploy1002> |
Synchronized wmf-config/db-eqiad.php: Repool pc1007 as pc1 master T282761 (duration: 01m 04s) |
[production] |
12:47 |
<tgr> |
EU deploys done |
[production] |
12:40 |
<tgr@deploy1002> |
Synchronized php-1.37.0-wmf.7/extensions/GrowthExperiments/: Backport: [[gerrit:695437|Add Link: Prevent double-opening of the post-edit dialog (T283120)]] [[gerrit:695479|Always delete from search index in AddLinkSubmissionHandler (T283606)]] (duration: 01m 06s) |
[production] |
12:40 |
<topranks> |
cr2-eqord: Gerrit 696383: Removing IPv4 Anycast ranges from bgp_out policy. |
[production] |
12:39 |
<tgr@deploy1002> |
Synchronized php-1.37.0-wmf.6/extensions/GrowthExperiments/: Backport: [[gerrit:695436|Add Link: Prevent double-opening of the post-edit dialog (T283120)]] [[gerrit:695437|Add Link: Prevent double-opening of the post-edit dialog (T283120)]] (duration: 01m 06s) |
[production] |
12:25 |
<tgr@deploy1002> |
Synchronized php-1.37.0-wmf.7/extensions/VisualEditor/modules/ve-mw/ui/dialogs/ve.ui.MWTransclusionDialog.js: Backport: [[gerrit:695831|Don't update backButton visibility if not set (T283511)]] (duration: 01m 06s) |
[production] |
11:51 |
<tgr@deploy1002> |
Synchronized php-1.37.0-wmf.6/extensions/VisualEditor/modules/ve-mw/ui/dialogs/ve.ui.MWTransclusionDialog.js: Backport: [[gerrit:695832|Don't update backButton visibility if not set (T283511)]] (duration: 01m 06s) |
[production] |
10:27 |
<kormat@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2082.codfw.wmnet with reason: Rebuilding db2094:s8 from db2082 T283793 |
[production] |
10:26 |
<kormat@cumin1001> |
START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2082.codfw.wmnet with reason: Rebuilding db2094:s8 from db2082 T283793 |
[production] |
10:23 |
<kormat@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on dborch1001.wikimedia.org with reason: Rebuilding db2094:s8 from db2082 12:19:41 <kormat> i thought also i might directly move pc1010 to pc2, so that it'll have a few days of pc2 cache available when we make it pc2 primary next week |
[production] |
10:23 |
<kormat@cumin1001> |
START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on dborch1001.wikimedia.org with reason: Rebuilding db2094:s8 from db2082 12:19:41 <kormat> i thought also i might directly move pc1010 to pc2, so that it'll have a few days of pc2 cache available when we make it pc2 primary next week |
[production] |
09:46 |
<kormat> |
restarting mariadb on pc1007 to upgrade it |
[production] |
08:35 |
<topranks> |
removing stale peers (AS8674 / Netnod and AS57695 / Misaka) from cr2-esams |
[production] |
08:30 |
<moritzm> |
installing libx11 security updates |
[production] |
07:45 |
<topranks> |
cmooney@cumin1001 Gerrit 694305: Run homer to add Wikidough prefix aggregate config on cr's in AMS |
[production] |
07:44 |
<legoktm> |
adding stephane at kiwix as owner of offline-l per email |
[production] |
07:43 |
<topranks> |
cmooney@cumin1001 Gerrit 694305: Run homer to add Wikidough prefix aggregate config on cr's in eqsin |
[production] |
07:42 |
<topranks> |
cmooney@cumin1001 Gerrit 694305: Run homer to add Wikidough prefix aggregate config on cr2-eqord |
[production] |
07:20 |
<topranks> |
cmooney@cumin1001 Gerrit 694305: Run homer to announce Wikidough Anycast range from cr's in ulsfo |
[production] |
07:14 |
<topranks> |
cmooney@cumin1001 Gerrit 694305: Add Wikidough Anycast range to aggregate config to cr1-eqdfw |
[production] |
07:11 |
<topranks> |
cmooney@cumin1001 Gerrit 694305: Add Wikidough Anycast range to aggregate config to cr2-codfw |
[production] |
06:47 |
<ryankemper@puppetmaster2001> |
conftool action : set/pooled=no; selector: name=wdqs1003.eqiad.wmnet |
[production] |
06:43 |
<urbanecm@deploy1002> |
Synchronized wmf-config/interwiki.php: Update interwiki cache (duration: 02m 13s) |
[production] |
06:09 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'db1148 (re)pooling @ 100%: Repool db1148', diff saved to https://phabricator.wikimedia.org/P16227 and previous config saved to /var/cache/conftool/dbconfig/20210527-060953-root.json |
[production] |
05:55 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Depool db1147', diff saved to https://phabricator.wikimedia.org/P16226 and previous config saved to /var/cache/conftool/dbconfig/20210527-055507-marostegui.json |
[production] |
05:54 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'db1148 (re)pooling @ 75%: Repool db1148', diff saved to https://phabricator.wikimedia.org/P16225 and previous config saved to /var/cache/conftool/dbconfig/20210527-055450-root.json |
[production] |
05:39 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'db1148 (re)pooling @ 50%: Repool db1148', diff saved to https://phabricator.wikimedia.org/P16224 and previous config saved to /var/cache/conftool/dbconfig/20210527-053946-root.json |
[production] |
05:29 |
<ryankemper> |
`ryankemper@cloudelastic1003:~$ sudo run-puppet-agent --force` |
[production] |
05:24 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'db1148 (re)pooling @ 25%: Repool db1148', diff saved to https://phabricator.wikimedia.org/P16223 and previous config saved to /var/cache/conftool/dbconfig/20210527-052442-root.json |
[production] |
2021-05-26
23:07 |
<ladsgroup@deploy1002> |
Synchronized php-1.37.0-wmf.7/includes/resourceloader/dependencystore/SqlModuleDependencyStore.php: Backport: [[gerrit:695325|resourceloader: Avoid primary connection in SqlModuleDependencyStore (2)]] (duration: 01m 06s) |
[production] |
23:03 |
<ladsgroup@deploy1002> |
Synchronized php-1.37.0-wmf.6/includes/resourceloader/dependencystore/SqlModuleDependencyStore.php: Backport: [[gerrit:695324|resourceloader: Avoid primary connection in SqlModuleDependencyStore (2)]] (duration: 01m 06s) |
[production] |
22:17 |
<ladsgroup@deploy1002> |
Synchronized php-1.37.0-wmf.7/includes/resourceloader/dependencystore/SqlModuleDependencyStore.php: Backport: [[gerrit:695321|resourceloader: Avoid opening a connection to master when not needed]] (duration: 01m 06s) |
[production] |
22:10 |
<ladsgroup@deploy1002> |
Synchronized php-1.37.0-wmf.6/includes/resourceloader/dependencystore/SqlModuleDependencyStore.php: Backport: [[gerrit:695320|resourceloader: Avoid opening a connection to master when not needed]] (duration: 01m 07s) |
[production] |
21:22 |
<tgr> |
T283606: running mwscript extensions/GrowthExperiments/maintenance/fixLinkRecommendationData.php --wiki={ar,bn,cs,vi}wiki --verbose --search-index |
[production] |
19:58 |
<twentyafterfour> |
finished deploying wmf.7 and error levels appear unchanged. refs T281148 |
[production] |
19:57 |
<andrew@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudvirt1018.eqiad.wmnet with reason: REIMAGE |
[production] |
19:55 |
<andrew@cumin1001> |
START - Cookbook sre.hosts.downtime for 2:00:00 on cloudvirt1018.eqiad.wmnet with reason: REIMAGE |
[production] |
19:51 |
<twentyafterfour@deploy1002> |
Synchronized php: group1 wikis to 1.37.0-wmf.7 (duration: 01m 07s) |
[production] |