2020-12-18
13:31 <jiji@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1021.eqiad.wmnet with reason: REIMAGE [production]
13:31 <jiji@cumin1001> START - Cookbook sre.hosts.downtime for 2:00:00 on mc2021.codfw.wmnet with reason: REIMAGE [production]
13:29 <jiji@cumin1001> START - Cookbook sre.hosts.downtime for 2:00:00 on mc1021.eqiad.wmnet with reason: REIMAGE [production]
13:22 <marostegui> Compress clouddb1018:3312 clouddb1014:3312 T270473 [production]
10:59 <jynus> starting test swift backup of enwiki on a single thread towards dbstore2003 T264189 [production]
10:53 <jynus> returning db2102 to its original state [production]
10:52 <arturo> live-hacking local puppetmaster with https://gerrit.wikimedia.org/r/c/operations/puppet/+/650470 (T267966) [toolsbeta]
10:42 <arturo> updated facts from the tools project: `PUPPET_MASTER="tools-puppetmaster-02.eqiad.wmflabs" modules/puppet_compiler/files/compiler-update-facts` [puppet-diffs]
10:33 <dcaro> purging rbd snapshots for image fc6fb78b-4515-4dcc-8254-591b9fe01762 (T270478) [admin]
10:20 <hashar@deploy1001> Finished deploy [integration/docroot@1166384]: noop: clear out proper env variable in tests (duration: 00m 07s) [production]
10:20 <hashar@deploy1001> Started deploy [integration/docroot@1166384]: noop: clear out proper env variable in tests [production]
09:13 <marostegui> Compress clouddb1018:3317 clouddb1014:3317 T270473 [production]
08:26 <jynus> temporarily taking db2102 offline for mysql testing [production]
07:54 <elukey> on kafka-test10[08-10] - "ip addr flush dev ens5; systemctl restart ifup@ens5.service" [production]
07:37 <legoktm> reloaded zuul for https://gerrit.wikimedia.org/r/649745 [releng]
07:06 <marostegui> Stop mysql on db1124:3313 T268742 [production]
07:02 <marostegui@cumin1001> dbctl commit (dc=all): 'Remove es1013 from dbctl T268436', diff saved to https://phabricator.wikimedia.org/P13600 and previous config saved to /var/cache/conftool/dbconfig/20201218-070235-marostegui.json [production]
07:00 <marostegui> Compress clouddb1019:3316 clouddb1015:3316 T270473 [production]
06:53 <marostegui> Compress clouddb1020:3315 clouddb1016:3315 T270473 [production]
01:34 <legoktm> restarted gerrit (T270451) [production]
01:33 <legoktm@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on gerrit1001.wikimedia.org with reason: OOM [production]
01:33 <legoktm@cumin1001> START - Cookbook sre.hosts.downtime for 0:10:00 on gerrit1001.wikimedia.org with reason: OOM [production]
01:27 <urbanecm@deploy1001> Synchronized php-1.36.0-wmf.22/includes/EditPage.php: 4c224bb88e968e885befd9e201ff96c29b976f11: SECURITY: Act like users dont exist if hidden from viewer (T120883) (duration: 00m 53s) [production]
01:22 <mutante> signing puppet certs and installing buster on doc1002/doc2001 with "insetup" role [production]
00:51 <urbanecm@deploy1001> Synchronized php-1.36.0-wmf.22/resources/src/vue/index.js: ed8212bfbe1854cc92a9f1cb33b5661cd0a8382c: Revert "vue: Log component errors" (duration: 00m 55s) [production]
00:30 <foks> reset email for Sutton12 [production]
00:23 <mutante> DNS - new project language 'nia' added - The Nias language is an Austronesian language spoken on Nias Island and the Batu Islands off the west coast of Sumatra in Indonesia. [production]
00:12 <urbanecm@deploy1001> Synchronized wmf-config/InitialiseSettings.php: 3cc5caa12fc4143295560a375d6be70819e4daad: Undeploy graphoid for arwiki. Phase 4. (T270443) (duration: 00m 55s) [production]
00:11 <urbanecm@deploy1001> sync-file aborted: (no justification provided) (duration: 00m 00s) [production]
2020-12-17
23:41 <James_F> Zuul: Add Gerrit maintenance bot to whitelist T253439 [releng]
23:39 <dzahn@cumin1001> START - Cookbook sre.ganeti.makevm [production]
23:04 <jhuneidi@deploy1001> helmfile [eqiad] Ran 'sync' command on namespace 'blubberoid' for release 'production' . [production]
22:51 <jhuneidi@deploy1001> helmfile [codfw] Ran 'sync' command on namespace 'blubberoid' for release 'production' . [production]
22:47 <jhuneidi@deploy1001> helmfile [staging] Ran 'sync' command on namespace 'blubberoid' for release 'staging' . [production]
22:17 <andrewbogott> correction to above, set the pg and pgp to 1024 for eqiad1-glance-images [admin]
22:16 <andrewbogott> setting pgp number to 8192 for eqiad1-compute (a 4x increase) and 2048 for eqiad1-glance-images (also a 4x increase) T270305 (same as pg) [admin]
22:14 <andrewbogott> setting pg number to 8192 for eqiad1-compute (a 4x increase) and 2048 for eqiad1-glance-images (also a 4x increase) T270305 [admin]
22:10 <andrewbogott> setting autoscale to 'warn' for both ceph pools (eqiad1-compute and eqiad1-glance-images) [admin]
22:09 <dzahn@cumin1001> END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) [production]
21:42 <bstorm> doing the same procedure to increase the timeouts more T267966 [tools]
21:06 <dzahn@cumin1001> START - Cookbook sre.ganeti.makevm [production]
21:06 <dzahn@cumin1001> END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=99) [production]
21:06 <dzahn@cumin1001> START - Cookbook sre.ganeti.makevm [production]
21:01 <ema> cp3052: ban 'req.http.host == "docker-registry.wikimedia.org"' T270270 [production]
20:59 <James_F> Zuul: [wikipeg] Run the composer-test-package suite T269720 [releng]
20:34 <James_F> Zuul: Add experimental PHP 8.0 jobs for OOUI and Wikipeg [releng]
20:24 <marxarelli> all wikis to 1.36.0-wmf.22 complete. no new errors or concerning rates (refs T267415) [production]
20:16 <dduvall@deploy1001> rebuilt and synchronized wikiversions files: all wikis to 1.36.0-wmf.22 [production]
19:56 <bstorm> puppet enabled one at a time, letting things catch up. Timeouts are now adjusted to something closer to fsync values T267966 [tools]
19:44 <bstorm> set etcd timeouts seed value to 20 instead of the default 10 (profile::wmcs::kubeadm::etcd_latency_ms) T267966 [tools]