2014-08-19
|
16:11 |
<hashar> |
deleted /usr/local/apache/common-local symlink, made it a directory and retriggered https://integration.wikimedia.org/ci/job/beta-scap-eqiad/17887/console |
[releng] |
16:03 |
<bd808> |
Removed local changes to /usr/local/apache/conf/wmflabs-logging.conf on deployment-mediawiki02; logs back to nfs share |
[releng] |
15:52 |
<bd808> |
Changed apache logging level from debug to notice on deployment-mediawiki02 in /usr/local/apache/conf/wmflabs-logging.conf |
[releng] |
15:47 |
<bd808> |
Changed apache logging level from debug to warn on deployment-mediawiki02 |
[releng] |
15:44 |
<bd808> |
/var full on deployment-mediawiki02; deleting 572M /var/log/apache2/debug.log.1 |
[releng] |
15:03 |
<hashar> |
Killed some stalled scap / rsync processes on deployment-bastion that were preventing https://integration.wikimedia.org/ci/job/beta-scap-eqiad/ from acquiring the lock. |
[releng] |
14:17 |
<hashar> |
huge rsync in progress on bastion |
[releng] |
14:00 |
<hashar> |
On bastion, reverted the symlink and manually created directory /usr/local/apache/common-local |
[releng] |
13:55 |
<hashar_> |
On bastion, deleting /usr/local/apache/common-local and symlinking it to /srv/common-local |
[releng] |
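The symlink-to-directory swap logged above for /usr/local/apache/common-local could be sketched as a small shell helper. The function name `symlink_to_dir` is hypothetical and is not part of any tooling mentioned in this log. It removes only the link itself and leaves the link target untouched.

```shell
# Hypothetical helper (not from the log): replace a symlink with a real directory.
symlink_to_dir() {
    target="$1"
    # rm on a symlink removes the link only, never the directory it points at.
    if [ -L "$target" ]; then
        rm -- "$target"
    fi
    # Create the directory; -p makes this a no-op if it already exists.
    mkdir -p -- "$target"
}
```

Assumed usage on the affected host, run with sufficient privileges: `symlink_to_dir /usr/local/apache/common-local`.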
2014-08-18
|
22:22 |
<^d> |
dropped apache01/02 instances; they are unused and the resources are needed |
[releng] |
18:23 |
<manybubbles> |
finished upgrading elasticsearch in beta - everything seems ok so far |
[releng] |
18:15 |
<bd808> |
Restarted salt-minion on deployment-mediawiki01 & deployment-rsync01 |
[releng] |
18:15 |
<bd808> |
Ran `sudo pkill python` on deployment-rsync01 to kill hundreds of grain-ensure processes |
[releng] |
18:12 |
<bd808> |
Ran `sudo pkill python` on deployment-mediawiki01 to kill hundreds of grain-ensure processes |
[releng] |
18:10 |
<manybubbles> |
finally restarting beta's elasticsearch servers now that they have new jars |
[releng] |
17:56 |
<bd808> |
Manually ran trebuchet fetches on deployment-elastic0* |
[releng] |
17:49 |
<bd808> |
Forcing puppet run on deployment-elastic01 |
[releng] |
17:47 |
<godog> |
upgraded hhvm on mediawiki02 to 3.3-dev+20140728+wmf5 |
[releng] |
17:44 |
<bd808> |
Trying to restart minions again with `salt '*' -b 1 service.restart salt-minion` |
[releng] |
17:39 |
<bd808> |
Restarting minions via `salt '*' service.restart salt-minion` |
[releng] |
17:38 |
<bd808> |
Restarted salt-master service on deployment-salt |
[releng] |
17:19 |
<bd808> |
16:37 Restarted Apache and HHVM on deployment-mediawiki02 to pick up removal of /etc/php5/conf.d/mail.ini (logged in prod SAL by mistake) |
[releng] |
16:59 |
<manybubbles|lunc> |
upgrading Elasticsearch in beta to 1.3.2 |
[releng] |
16:11 |
<bd808> |
Manually applied https://gerrit.wikimedia.org/r/#/c/141287/12/templates/mail/exim4.minimal.erb on deployment-mediawiki02 and restarted exim4 service |
[releng] |
15:28 |
<bd808> |
Puppet failing for deployment-mathoid due to duplicate definition error in trebuchet config |
[releng] |
15:15 |
<bd808> |
Reinstated puppet patch to depool deployment-mediawiki01 and forced puppet run on all deployment-cache-* hosts |
[releng] |
15:04 |
<bd808> |
Puppet run failing on deployment-mediawiki01 (apache won't start); Puppet disabled on deployment-mediawiki02 ('reason not specified'). Fixing this probably needs to wait until Giuseppe is back from vacation. |
[releng] |
15:00 |
<bd808> |
Rebooting deployment-eventlogging02 via wikitech; console filling with OOM killer messages and puppet runs failing with "Cannot allocate memory - fork(2)" |
[releng] |
14:29 |
<bd808> |
Forced puppet run on deployment-cache-upload02 |
[releng] |
14:27 |
<bd808> |
Forced puppet run on deployment-cache-text02 |
[releng] |
14:24 |
<bd808> |
Forced puppet run on deployment-cache-mobile03 |
[releng] |
14:20 |
<bd808> |
Forced puppet run on deployment-cache-bits01 |
[releng] |
2014-08-15
|
21:57 |
<legoktm> |
set $wgVERPsecret in PrivateSettings.php |
[releng] |
21:42 |
<hashSpeleology> |
Beta cluster database updates are broken due to CentralNotice. Fix up is {{gerrit|154231}} |
[releng] |
20:57 |
<hashSpeleology> |
deployment-rsync01: deleting /usr/local/apache/common-local content, then ln -s /srv/common-local /usr/local/apache/common-local as set by beta::common, which is not applied on that host for some reason. {{bug|69590}} |
[releng] |
20:55 |
<hashSpeleology> |
puppet administratively disabled on mediawiki02. Assuming some work is in progress on that host. Leaving it untouched |
[releng] |
20:54 |
<hashSpeleology> |
puppet is proceeding on mediawiki01 |
[releng] |
20:52 |
<hashSpeleology> |
attempting to unbreak mediawiki code update {{bug|69590}} by cherry picking {{gerrit|154329}} |
[releng] |
20:39 |
<hashSpeleology> |
In case it is not in SAL: MediaWiki is no longer synced to the app servers {{bug|69590}} |
[releng] |
20:20 |
<hashSpeleology> |
rebooting mediawiki01; /var refuses to clear out and is stuck at 100% usage |
[releng] |
20:16 |
<hashSpeleology> |
cleaning up /var/log on deployment-mediawiki02 |
[releng] |
20:14 |
<hashSpeleology> |
on deployment-mediawiki01 deleting /var/log/apache2/access.log.1 |
[releng] |
20:13 |
<hashSpeleology> |
on deployment-mediawiki01 deleting /var/log/apache2/debug.log.1 |
[releng] |
20:13 |
<hashSpeleology> |
bunch of instances have a full /var/log :-/ |
[releng] |
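Several entries above free up a full /var by deleting rotated Apache logs. A minimal sketch of how the largest files under a log directory might be listed before deciding what to delete is given here. The function name `largest_files` is hypothetical, and the sketch assumes GNU find (for `-printf`), which is an assumption about the hosts rather than something stated in the log.

```shell
# Hypothetical helper (not from the log): list the N largest files under a
# directory, biggest first, as "<size in bytes><TAB><path>".
# Assumes GNU find, since -printf is not available in every find implementation.
largest_files() {
    dir="$1"
    count="${2:-5}"   # default to the five largest files
    find "$dir" -type f -printf '%s\t%p\n' | sort -rn | head -n "$count"
}
```

Assumed usage: `largest_files /var/log/apache2 3` prints the three largest files, which on the hosts above would presumably include rotated files such as debug.log.1.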