2017-03-06
§
|
10:36 |
<gehel> |
upgrade to elasticsearch 5.2.2 on relforge cluster - T156150 |
[production] |
10:24 |
<elukey> |
(shamefully) replaced /etc/init.d/hadoop-hdfs-datanode script with "exit 0" to prevent the HDFS datanode daemon from starting on analytics1028 (broken disk) and leave the rest running (puppet included) - T159632 |
[production] |
10:12 |
<gehel> |
postgresql upgrade on maps* (postgresql-9.4 postgresql-9.4-postgis-2.3 postgresql-9.4-postgis-2.3-scripts postgresql-client-9.4 postgresql-client-common postgresql-common postgresql-contrib-9.4) |
[production] |
10:06 |
<ariel@tin> |
Finished deploy [dumps/dumps@8521be0]: fix: retries of broken runs could except on uninited var (duration: 00m 01s) |
[production] |
10:06 |
<ariel@tin> |
Started deploy [dumps/dumps@8521be0]: fix: retries of broken runs could except on uninited var |
[production] |
09:46 |
<gehel> |
postgresql upgrade on maps-test* (postgresql-9.4 postgresql-9.4-postgis-2.3 postgresql-9.4-postgis-2.3-scripts postgresql-client-9.4 postgresql-client-common postgresql-common postgresql-contrib-9.4) |
[production] |
09:14 |
<ariel@tin> |
Finished deploy [dumps/dumps@04794df]: move default config into a file and clean up (duration: 00m 02s) |
[production] |
09:14 |
<ariel@tin> |
Started deploy [dumps/dumps@04794df]: move default config into a file and clean up |
[production] |
09:09 |
<gehel> |
killing stuck tilerator notification on maps-test2001 - T145534 |
[production] |
07:22 |
<marostegui> |
Resume pt-table-checksum on plwiki (s2) - T154485 |
[production] |
06:59 |
<marostegui> |
Deploy ALTER table on db2046 (s6) for the revision table - T159414 |
[production] |
06:46 |
<marostegui@tin> |
Synchronized wmf-config/db-codfw.php: Depool db2046 - T159414 (duration: 00m 51s) |
[production] |
02:24 |
<l10nupdate@tin> |
ResourceLoader cache refresh completed at Mon Mar 6 02:24:24 UTC 2017 (duration 5m 19s) |
[production] |
02:19 |
<l10nupdate@tin> |
scap sync-l10n completed (1.29.0-wmf.14) (duration: 07m 15s) |
[production] |
01:29 |
<cwd> |
updated staging civicrm database and triggers |
[production] |
2017-03-04
§
|
16:43 |
<Reedy> |
Manually generating even more captchas (going up to 10k total) in screen as reedy on terbium T159581 |
[production] |
16:35 |
<Reedy> |
Manually generating some more captchas T159581 |
[production] |
03:28 |
<legoktm> |
pausing refreshLinks.php run due to increase in job queue |
[production] |
03:05 |
<mutante> |
planet2001 - and this time it just worked and i can't reproduce the issue. install finished. re-adding to puppet, signing certs... |
[production] |
03:00 |
<mutante> |
planet2001 - reinstalling once more (T159432) |
[production] |
02:36 |
<l10nupdate@tin> |
ResourceLoader cache refresh completed at Sat Mar 4 02:36:25 UTC 2017 (duration 5m 19s) |
[production] |
02:31 |
<l10nupdate@tin> |
scap sync-l10n completed (1.29.0-wmf.14) (duration: 12m 10s) |
[production] |
00:52 |
<mutante> |
conf2002 - ran "systemctl reset-failed" to fix Icinga alert about broken systemd state due to formerly existing but failed service etcdmirror-eqiad-wmnet. turns out you need this to remove missing units. found on http://serverfault.com/questions/606520/how-to-remove-missing-systemd-units (T131959) |
[production] |
2017-03-03
§
|
23:23 |
<RainbowSprinkles> |
phabricator: restarted apache 1 last time, removed hack |
[production] |
23:19 |
<mutante> |
icinga: for special external hosts benefactorevents and eventdonations, "submit passive check result for this host" -> "check_tcp -p 80" to avoid "crit hosts" that just don't respond to ICMP (http://www.htmlgraphic.com/nagios-check-host-without-ping/) |
[production] |
23:12 |
<RainbowSprinkles> |
phabricator: restarting apache real quick |
[production] |
22:03 |
<hashar> |
rebooting contint2001 |
[production] |
21:54 |
<hashar> |
restarting Jenkins |
[production] |
21:51 |
<hashar> |
enabling puppet on contint1001 and puppet-run |
[production] |
21:05 |
<hashar> |
disabled puppet on contint1001 |
[production] |
20:26 |
<mattflaschen@tin> |
Synchronized wmf-config/InitialiseSettings-labs.php: Beta Cluster only (duration: 00m 40s) |
[production] |
19:35 |
<ebernhardson> |
restart elasticsearch on relforge1002 to update remote reindex whitelist |
[production] |
19:33 |
<ebernhardson> |
restart elasticsearch on relforge1001 to update remote reindex whitelist |
[production] |
19:11 |
<legoktm> |
running refreshLinks.php across small wikis |
[production] |
18:43 |
<addshore@tin> |
Synchronized php-1.29.0-wmf.14/extensions/RevisionSlider/modules/ext.RevisionSlider.css: T159428 [[gerrit:340794|Quick fix for misplaced tooltips on RTL wikis]] (duration: 00m 42s) |
[production] |
17:34 |
<hashar> |
CI is mostly recovered. It had been unable to spawn instances. The queue is being processed and will take a while to complete. Check status on https://integration.wikimedia.org/zuul/ | T159543 |
[production] |
16:17 |
<hashar> |
Stopped Jenkins from processing builds while instances are being recycled |
[production] |
13:37 |
<marostegui@tin> |
Synchronized wmf-config/db-codfw.php: Repool db2067 - T159414 (duration: 00m 50s) |
[production] |
13:12 |
<elukey> |
removed apache2 (rc state) and apache2-utils from analytics1027 |
[production] |
11:11 |
<elukey@tin> |
Finished deploy [analytics/refinery@1440646]: (no justification provided) (duration: 00m 14s) |
[production] |
11:11 |
<elukey@tin> |
Started deploy [analytics/refinery@1440646]: (no justification provided) |
[production] |
11:09 |
<elukey@tin> |
Finished deploy [analytics/refinery@1440646]: (no justification provided) (duration: 00m 02s) |
[production] |
11:09 |
<elukey@tin> |
Started deploy [analytics/refinery@1440646]: (no justification provided) |
[production] |
11:05 |
<jynus> |
stopping mariadb and restarting db1051 for maintenance |
[production] |
11:03 |
<joal@tin> |
Finished deploy [analytics/refinery@1440646]: (no justification provided) (duration: 01m 23s) |
[production] |
11:02 |
<joal@tin> |
Started deploy [analytics/refinery@1440646]: (no justification provided) |
[production] |