2021-07-27 §
05:32 <marostegui@cumin1001> START - Cookbook sre.hosts.downtime for 2:00:00 on db1162.eqiad.wmnet with reason: REIMAGE [production]
05:12 <marostegui@cumin1001> dbctl commit (dc=all): 'Depool db1162 T287230', diff saved to https://phabricator.wikimedia.org/P16899 and previous config saved to /var/cache/conftool/dbconfig/20210727-051212-marostegui.json [production]
2021-07-26 §
23:37 <legoktm@deploy1002> Synchronized php-1.37.0-wmf.15/extensions/Score/includes/Score.php: Increase lilypond version cache TTL to 1 hour (duration: 00m 57s) [production]
18:30 <cstone> SmashPig revision changed from be272c02ce to 020d4eccd4 [production]
17:41 <legoktm> ran `scap pull` and repooled mw2336.codfw.wmnet - T287394 [production]
17:41 <legoktm@cumin1001> conftool action : set/pooled=yes; selector: name=mw2336.codfw.wmnet [production]
17:40 <jynus@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dbprov1002.eqiad.wmnet with reason: REIMAGE [production]
17:38 <jynus@cumin1001> START - Cookbook sre.hosts.downtime for 2:00:00 on dbprov1002.eqiad.wmnet with reason: REIMAGE [production]
16:06 <legoktm> depooled mw2336.codfw.wmnet, mgmt is down too. T287394 [production]
16:04 <legoktm@cumin1001> conftool action : set/pooled=no; selector: name=mw2336.codfw.wmnet [production]
15:29 <hashar> Restarted gerrit replica on gerrit2001.wikimedia.org # T287122 [production]
15:24 <ladsgroup@deploy1002> Synchronized php-1.37.0-wmf.15/extensions/AbuseFilter/includes/AbuseFilterHooks.php: Backport: [[gerrit:707021|Don’t generate current content text twice]], Part II (duration: 01m 49s) [production]
15:21 <ladsgroup@deploy1002> Synchronized php-1.37.0-wmf.15/extensions/AbuseFilter/includes/VariableGenerator/RunVariableGenerator.php: Backport: [[gerrit:707021|Don’t generate current content text twice]], Part I (duration: 01m 50s) [production]
15:19 <topranks> Adding peering to AS139931 - Bangladesh Submarine Cable Company - at Equinix Singapore on cr3-eqsin [production]
14:42 <dcausse@deploy1002> helmfile [staging] Ran 'sync' command on namespace 'rdf-streaming-updater' for release 'main' . [production]
13:42 <oblivian@deploy1002> helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' . [production]
10:55 <ladsgroup@deploy1002> Synchronized wmf-config/InitialiseSettings.php: Disable DPL on ruwikinews (duration: 00m 27s) [production]
10:53 <ladsgroup@deploy1002> Scap failed!: 3/6 canaries failed their endpoint checks(https://en.wikipedia.org) [production]
10:52 <ladsgroup@deploy1002> Scap failed!: 2/6 canaries failed their endpoint checks(https://en.wikipedia.org) [production]
10:51 <jynus> deploying 10 second mw user query limit on s3 codfw replicas [production]
10:49 <marostegui@cumin1001> dbctl commit (dc=all): 'Fully repool db2149', diff saved to https://phabricator.wikimedia.org/P16895 and previous config saved to /var/cache/conftool/dbconfig/20210726-104953-marostegui.json [production]
10:46 <marostegui@cumin1001> dbctl commit (dc=all): 'Slowly repool db2149', diff saved to https://phabricator.wikimedia.org/P16894 and previous config saved to /var/cache/conftool/dbconfig/20210726-104649-marostegui.json [production]
10:46 <marostegui@cumin1001> dbctl commit (dc=all): 'Slowly repool db2149', diff saved to https://phabricator.wikimedia.org/P16893 and previous config saved to /var/cache/conftool/dbconfig/20210726-104613-marostegui.json [production]
10:38 <marostegui@cumin1001> dbctl commit (dc=all): 'Depool db2149', diff saved to https://phabricator.wikimedia.org/P16892 and previous config saved to /var/cache/conftool/dbconfig/20210726-103847-marostegui.json [production]
10:33 <oblivian@deploy1002> helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' . [production]
09:55 <jgiannelos@deploy1002> helmfile [staging] Ran 'sync' command on namespace 'tegola-vector-tiles' for release 'main' . [production]
09:15 <XioNoX> rollback sampling for T286038 [production]
08:31 <jmm@cumin2002> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host sretest1002.eqiad.wmnet [production]
08:27 <jmm@cumin2002> START - Cookbook sre.hosts.reboot-single for host sretest1002.eqiad.wmnet [production]
08:26 <jmm@cumin2002> END (ERROR) - Cookbook sre.hosts.reboot-single (exit_code=97) for host sretest1001.eqiad.wmnet [production]
08:11 <jmm@cumin2002> START - Cookbook sre.hosts.reboot-single for host sretest1001.eqiad.wmnet [production]
07:18 <_joe_> docker-image prune on deneb T287222 [production]
07:17 <_joe_> manage-production-images prune on deneb, T287222 [production]
07:08 <marostegui> Optimize dewiki.logging in eqiad (there will be lag) [production]
06:39 <moritzm> installing krb5 security updates [production]
05:55 <Amir1> start cleaning up auto-review flagged revs logs in plwiki [production]
2021-07-24 §
11:04 <urbanecm> [urbanecm@mwmaint2002 ~]$ mwscript extensions/Translate/scripts/moveTranslatablePage.php --wiki=commonswiki --reason='OTRS -> VRTS renaming process; see [[Phab:T280392]] and [[Phab:T280397]]' --move-subpages 'Commons:OTRS' 'Commons:Volunteer Response Team' 'Martin Urbanec' # T287321 [production]
2021-07-23 §
19:11 <topranks> Successfully re-pooled eqiad - reversed change from yesterday after successful line card replacement in cr2-codfw - T287110 [production]
19:02 <topranks> De-pooling eqiad again after successful replacement of linecard in cr2-codfw T287110 [production]
18:26 <legoktm@deploy1002> helmfile [codfw] Ran 'sync' command on namespace 'shellbox' for release 'main' . [production]
18:24 <legoktm@deploy1002> helmfile [eqiad] Ran 'sync' command on namespace 'shellbox' for release 'main' . [production]
18:14 <topranks> Turning up et-0/0/[0-1] and et-0/2/[0-1] interfaces on cr2-codfw after line card replacement slot 0. [production]
18:12 <legoktm@deploy1002> helmfile [staging] Ran 'sync' command on namespace 'shellbox' for release 'main' . [production]
16:15 <effie> enable puppet on mc-gp* hosts [production]
15:47 <papaul> powerdown wdqs2002 for IDRAC reset [production]
15:45 <elukey@deploy1002> helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'. [production]
15:44 <elukey@deploy1002> helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'. [production]
15:11 <elukey> stop ml-serve-ctrl1001 + gnt-instance modify -t plain ml-serve-ctrl1001.eqiad.wmnet on ganeti1009 + start instance back - T287238 [production]
14:36 <_joe_> rebuilding httpd-fcgi, mediawiki-http fixing logging T285384 [production]
14:16 <brennen> gitlab1001: running ansible to deploy [[gerrit:707236|fix puma exporter listen address]] (T275170) [production]