2021-05-27
23:56 <robh@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on phab1004.eqiad.wmnet with reason: REIMAGE [production]
23:54 <robh@cumin1001> START - Cookbook sre.hosts.downtime for 2:00:00 on phab1004.eqiad.wmnet with reason: REIMAGE [production]
23:45 <thcipriani@deploy1002> Synchronized README: Config: [[gerrit:696713|Revert "README: deployment training"]] (duration: 00m 55s) [production]
23:38 <derick@deploy1002> Synchronized README: Config: [[gerrit:696706|README: deployment training]] (duration: 00m 55s) [production]
23:21 <egardner@deploy1002> Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:693951|Enable MediaSearch Assessment filter (T276257)]] (duration: 00m 57s) [production]
23:14 <brennen> gitlab1001: gitlab-ctl stop nginx - pausing httpd for the weekend [releng]
22:06 <urbanecm> Invalidate bot password for `PKM@PKMbot` (T283839) [production]
21:53 <bstorm> added paws-k8s-control-2.paws.eqiad.wmflabs back to the list of control nodes at the proxy [paws]
21:50 <bstorm> renewed the certs for paws-k8s-control-2 [paws]
20:37 <jbond> add eugene-chernov, strofimovsky01, il to ldap nda #T279545 [production]
20:37 <bstorm> removed paws-k8s-control-2.paws.eqiad.wmflabs from the proxy because it is somewhat broken (certs expired) [paws]
20:37 <jbond> add eugene-chernov, strofimovsky01, il to ldap nda [production]
20:36 <brennen> gitlab1001: temporarily disabling backup cron jobs [releng]
19:53 <James_F> Manually create missing SecurePoll DB tables on mnwwiktionary, taywiki, and trvwiki for T283844 [production]
19:48 <legoktm@deploy1002> helmfile [staging] Ran 'sync' command on namespace 'shellbox' for release 'main' . [production]
19:42 <majavah> bump quota to 3 services T283754 [tools.notwikilambda]
19:41 <bstorm> forced removal of openrefine in paws for now and deleted all current user server pods to force use of the new image [paws]
19:21 <brennen@deploy1002> rebuilt and synchronized wikiversions files: all wikis to 1.37.0-wmf.7 [production]
19:15 <tgr> US morning deploys done [production]
19:12 <tgr@deploy1002> Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:695364|GrowthExperiments: Enable Add Links for 50% of new users and all old ones (T277356)]] (duration: 01m 04s) [production]
19:03 <tgr@deploy1002> Synchronized php-1.37.0-wmf.6/extensions/GrowthExperiments: Backport: [[gerrit:695833|Help panel: SwitchEditorPanel fixes (T282800)]] [[gerrit:695841|Avoid session loading when loading task types in help panel RL data (T282800)]] [[gerrit:696530|Add Link: Fix homepage PV token and newcomer task token logging (T283765)]] (duration: 01m 05s) [production]
18:57 <legoktm@deploy1002> helmfile [staging] Ran 'sync' command on namespace 'shellbox' for release 'main' . [production]
18:56 <tgr@deploy1002> Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:693208|ptwiki: Add 'flow-delete' to 'eliminator' user group (T283266)]] (duration: 01m 04s) [production]
18:49 <tgr@deploy1002> Synchronized php-1.37.0-wmf.7/extensions/GrowthExperiments: Backport: [[gerrit:695834|Help panel: SwitchEditorPanel fixes (T282800)]] [[gerrit:695842|Avoid session loading when loading task types in help panel RL data (T282800)]] [[gerrit:696527|Add Link: Fix homepage PV token and newcomer task token logging (T283765)]] (duration: 01m 06s) [production]
18:22 <legoktm@deploy1002> helmfile [staging] Ran 'sync' command on namespace 'shellbox' for release 'main' . [production]
18:09 <tgr@deploy1002> Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:696390|Enable Growth's community configuration on the pilot wikis (T283809)]] (duration: 01m 06s) [production]
18:03 <bstorm> adjusted profile::wmcs::kubeadm::etcd_latency_ms from 30 back to the default (10) [tools]
17:46 <legoktm> reloaded zuul for https://gerrit.wikimedia.org/r/696565 https://gerrit.wikimedia.org/r/696566 [releng]
17:26 <ryankemper@cumin2002> END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0) [production]
17:26 <ryankemper@cumin1001> END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0) [production]
17:23 <pt1979@cumin2002> END (PASS) - Cookbook sre.dns.netbox (exit_code=0) [production]
17:20 <James_F> Running SecurePoll maintenance script cli/updateNotBlockedKey.php for all wikis T277079 [production]
17:18 <pt1979@cumin2002> START - Cookbook sre.dns.netbox [production]
17:05 <cmjohnson@cumin1001> END (PASS) - Cookbook sre.dns.netbox (exit_code=0) [production]
16:59 <cmjohnson@cumin1001> START - Cookbook sre.dns.netbox [production]
16:52 <brennen> gitlab1001: ran gitlab-ctl start; logins now working; will add banner to effect that this is all provisional state [releng]
16:05 <brennen> gitlab1001: re-running ansible and puppet per T279545 [releng]
16:04 <bstorm> cleared error state from several exec node queues [tools]
15:58 <ryankemper> T280382 `sudo -i cookbook sre.wdqs.data-transfer --source wdqs1007.eqiad.wmnet --dest wdqs1006.eqiad.wmnet --reason "transferring fresh wikidata journal following runaway inflation of wdqs1006's wikidata.jnl" --blazegraph_instance blazegraph` on `ryankemper@cumin1001` tmux session `wdqs_disk` [production]
15:58 <ryankemper@cumin1001> START - Cookbook sre.wdqs.data-transfer [production]
15:56 <ryankemper> T280382 `sudo -i cookbook sre.wdqs.data-transfer --source wdqs2008.codfw.wmnet --dest wdqs2004.codfw.wmnet --reason "transferring fresh wikidata journal following runaway inflation of wdqs2004's wikidata.jnl" --blazegraph_instance blazegraph` on `ryankemper@cumin2002` tmux session `wdqs_disk` [production]
15:56 <ryankemper@cumin2002> START - Cookbook sre.wdqs.data-transfer [production]
15:53 <cmjohnson@cumin1001> END (PASS) - Cookbook sre.dns.netbox (exit_code=0) [production]
15:50 <ryankemper> T280382 (fixing couple wrong host names in last log line) `wdqs2004` inexplicably has a 2.5TB `wikidata.jnl`. By comparison `wdqs1006` has a 1.6T `wikidata.jnl`, and `wdqs2001`, `wdqs2002`, and `wdqs2008`, have a 975G `wikidata.jnl` [production]
15:49 <cmjohnson@cumin1001> START - Cookbook sre.dns.netbox [production]
15:44 <ryankemper> T280382 `wdqs2004` inexplicably has a 2.5TB `wikidata.jnl`. By comparison `wdqs1006` has a 1.6T `wikidata.jnl`, and `wdqs2004` and `wdqs2001` have a 975G `wikidata.jnl`. It's not clear why there's such a big divergence [production]
15:41 <ryankemper> T280382 `wdqs2004` inexplicably has a 2.5TB `wikidata.jnl`. By comparison `wdqs1006` has a 1.6T `wikidata.jnl` [production]
15:12 <XioNoX> test netconf over ssh on cr3-ulsfo [production]
15:03 <effie> disable puppet mc2019 [production]
14:58 <wm-bot> Testing - cookbook ran by dcaro@vulcanus [admin]